MODIFIED HIV ENV POLYPEPTIDES



Technical Field 

The invention relates generally to modified HIV envelope (Env) polypeptides which
are useful as immunizing agents or for generating an immune response in a subject, for
example a cellular immune response or a protective immune response. More particularly, the
invention relates to Env polypeptides such as gp120, gp140 or gp160, wherein at least one of
the native β-sheet configurations has been modified. The invention also pertains to methods
of using these polypeptides to elicit an immune response against a broad range of HIV
subtypes.

Background of the Invention 

The human immunodeficiency virus (HIV-1, also referred to as HTLV-III, LAV or
HTLV-III/LAV) is the etiological agent of the acquired immune deficiency syndrome (AIDS)
and related disorders (see, e.g., Barre-Sinoussi, et al., (1983) Science 220:868-871; Gallo et
al. (1984) Science 224:500-503; Levy et al., (1984) Science 225:840-842; Siegal et al., (1981)
N. Engl. J. Med. 305:1439-1444). AIDS patients usually have a long asymptomatic period
followed by the progressive degeneration of the immune system and the central nervous
system. Replication of the virus is highly regulated, and both latent and lytic infection of the
CD4 positive helper subset of T-lymphocytes occur in tissue culture (Zagury et al., (1986)
Science 231:850-853). Molecular studies of HIV-1 show that it encodes a number of genes
(Ratner et al., (1985) Nature 313:277-284; Sanchez-Pescador et al., (1985) Science 227:484-
492), including three structural genes (gag, pol and env) that are common to all
retroviruses. Nucleotide sequences from viral genomes of other retroviruses, particularly
HIV-2 and simian immunodeficiency viruses, SIV (previously referred to as STLV-III), also
contain these structural genes. (Guyader et al., (1987) Nature 326:662-669; Chakrabarti et
al., (1987) Nature

The envelope protein of HIV-1, HIV-2 and SIV is a glycoprotein of about 160 kd
(gp160). During virus infection of the host cell, gp160 is cleaved by host cell proteases to
form gp120 and the integral membrane protein, gp41. The gp41 portion is anchored in the
membrane bilayer of the virion, while the gp120 segment protrudes into the surrounding
environment. gp120 and gp41 are non-covalently associated, and free gp120 can be released
from the surface of virions and infected cells.

As depicted in Figure 1, crystallography studies of the gp120 core polypeptide
indicate that this polypeptide is folded into two major domains having certain emanating
structures. The inner domain (inner with respect to the N and C terminus) features a two-
helix, two-stranded bundle with a small five-stranded β-sandwich at its termini-proximal end
and a projection at the distal end from which the V1/V2 stem emanates. The outer domain is
a stacked double barrel that lies alongside the inner domain so that the outer barrel and inner
bundle axes are approximately parallel. Between the distal inner domain and the distal outer
domain is a four-stranded bridging sheet which holds a peculiar minidomain in contact with,
but distinct from, the inner domain, the outer domain, and the V1/V2 domain. The bridging
sheet is composed of four β-strand structures (β-3, β-2, β-21, β-20, shown in Figure 1). The
bridging region can be seen in Figure 1 packing primarily over the inner domain, although some
surface residues of the outer domain, such as Phe 382, reach into the bridging sheet to form
part of its hydrophobic core.

The basic unit of the β-sheet conformation of the bridging sheet region is the β-strand
which exists as a less tightly coiled helix, with 2.0 residues per turn. The β-strand
conformation is only stable when incorporated into a β-sheet, where hydrogen bonds with
close to optimal geometry are formed between the peptide groups on adjacent β-strands; the
dipole moments of the strands are also aligned favorably. Side chains from adjacent residues
of the same strand protrude from opposite sides of the sheet and do not interact with each
other, but have significant interactions with their backbone and with the side chains of
neighboring strands. For a general description of β-sheets, see, e.g., T.E. Creighton, Proteins:
Structures and Molecular Properties (W.H. Freeman and Company, 1993); and A.L.
Lehninger, Biochemistry (Worth Publishers, Inc., 1975).

The gp120 polypeptide is instrumental in mediating entry into the host cell. Recent
studies have indicated that binding of CD4 to gp120 induces a conformational change in Env
that allows for binding to a co-receptor (e.g., a chemokine receptor) and subsequent entry of
the virus into the cell. (Wyatt, R., et al. (1998) Nature 393:705-711; Kwong, P., et al. (1998)
Nature 393:648-659). Referring again to Figure 1, CD4 is bound into a depression formed at
the interface of the outer domain, the inner domain and the bridging sheet of gp120.

Immunogenicity of the gp120 polypeptide has also been studied. For example,
individuals infected by HIV-1 usually develop antibodies that can neutralize the virus in in
vitro assays, and this response is directed primarily against linear neutralizing determinants in
the third variable loop of gp120 glycoprotein (Javaherian, K., et al. (1989) Proc. Natl. Acad.
Sci. 86:6786-6772; Matsushita, M., et al. (1988) J. Virol. 62:2107-2144; Putney, S., et al.
(1986) Science 234:1392-1395; Rushe, J. R., et al. (1988) Proc. Nat. Acad. Sci. USA 85:
3198-3202). However, these antibodies generally exhibit the ability to neutralize only a
limited number of HIV-1 strains (Matthews, T. (1986) Proc. Natl. Acad. Sci. USA. 83:9709-
9713; Nara, P. L., et al. (1988) J. Virol. 62:2622-2628; Palker, T. J., et al. (1988) Proc. Natl.
Acad. Sci. USA. 85:1932-1936). Later in the course of HIV infection in humans, antibodies
capable of neutralizing a wider range of HIV-1 isolates appear (Barre-Sinoussi, F., et al.
(1983) Science 220:868-871; Robert-Guroff, M., et al. (1985) Nature (London) 316:72-74;
Weis, R., et al. (1985) Nature (London) 316:69-72; Weis, R., et al. (1986) Nature (London)
324:572-575).

Recent work done by Stamatatos et al. (1998) AIDS Res Hum Retroviruses
14(13):1129-39, shows that a deletion of the variable region 2 from an HIV-1 SF162 virus, which
utilizes the CCR-5 co-receptor for virus entry, rendered the virus highly susceptible to serum-
mediated neutralization. This V2 deleted virus was also neutralized by sera obtained from
patients infected not only with clade B HIV-1 isolates but also with clade A, C, D and F HIV-1
isolates. However, deletion of the variable region 1 had no effect. Deletion of the variable
regions 1 and 2 from a LAI isolate HIV-1 IIIB also increased the susceptibility to neutralization
by monoclonal antibodies whose epitopes are located within the V3 loop, the CD4-binding
site, and conserved gp120 regions (Wyatt, R., et al. (1995) J. Virol. 69:5723-5733). Rabbit
immunogenicity studies done with the HIV-1 virus with deletions in the V1/V2 and V3
region from the LAI strain, which uses the CXCR4 co-receptor for virus entry, showed no
improvement in the ability of Env to raise neutralizing antibodies (Leu et al. (1998) AIDS
Res. and Human Retroviruses. 14:151-155).

Further, a subset of the broadly reactive antibodies, found in most infected
individuals, interferes with the binding of gp120 and CD4 (Kang, C.-Y., et al. (1991) Proc.
Natl. Acad. Sci. USA. 88:6171-6175; McDougal, J. S., et al. (1986) J. Immunol. 137:2937-
2944). Other antibodies are believed to bind to the chemokine receptor binding region after
CD4 has bound to Env (Thali et al. (1993) J. Virol. 67:3978-3988). The fact that neutralizing
antibodies generated during the course of HIV infection do not provide permanent antiviral
effect may in part be due to the generation of "neutralization escape" virus mutants and to
the general decline in the host immune system associated with pathogenesis. In contrast, the
presence of pre-existing neutralizing antibodies upon initial HIV-1 exposure will likely have
a protective effect.

It is widely thought that a successful vaccine should be able to induce a strong,
broadly neutralizing antibody response against diverse HIV-1 strains (Montefiori and Evans
(1999) AIDS Res. Hum. Ret. 15(8):689-698; Bolognesi, D. P., et al. (1994) Ann. Int. Med.
8:603-611; Haynes, B. F., et al. (1996) Science 271:324-328). Neutralizing antibodies, by
attaching to the incoming virions, can reduce or even prevent their infectivity for target cells
and prevent the cell-to-cell spread of virus in tissue culture (Hu et al. (1992) Science 255:456-
459; Burton, D. R. and Montefiori, D. (1997) AIDS 11(suppl. A):587-598). However, as
described above, antibodies directed against gp120 do not generally exhibit broad antibody
responses against different HIV strains.

Currently, the focus of vaccine development, from the perspective of humoral
immunity, is on the neutralization of primary isolates that utilize the CCR5 chemokine co-
receptor believed to be important in virus entry (Zhu, T., et al. (1993) Science 261:1179-
1181; Fiore, J., et al. (1994) Virology 204:297-303). These viruses are generally much more
resistant to antibody neutralization than T-cell line adapted strains that use the CXCR4 co-
receptor, although both can be neutralized in vitro by certain broadly and potently acting
monoclonal antibodies, such as IgG1b12, 2G12 and 2F5 (Trkola, A., et al. (1995) J. Virol.
69:6609-6617; D'Souza, P. M., et al. (1997) J. Infect. Dis. 175:1062-1075). These monoclonal
antibodies are directed to the CD4 binding site, a glycosylation site and to the gp41 fusion
domain, respectively. The problem that remains, however, is that it is not known how to
induce antibodies of the appropriate specificity by vaccination. Antibodies (Abs) elicited by
gp120 glycoprotein from a given isolate are usually only able to neutralize closely related
viruses, generally from similar, and usually from the same, HIV-1 subtype.

Despite the above approaches, there remains a need for Env antigens that can elicit an
immunological response (e.g., neutralizing and/or protective antibodies) in a subject against
multiple HIV strains and subtypes, for example when administered as a vaccine. The present
invention solves these and other problems by providing modified Env polypeptides (e.g.,
gp120) to expose epitopes in or near the CD4 binding site.

Summary of the Invention

In accordance with the present invention, modified HIV Env polypeptides are
provided. In particular, deletions and/or mutations are made in one or more of the four
antiparallel β-strands of the bridging sheet in the HIV Env polypeptide. In this way, enough
structure is left to allow correct folding of the polypeptide, for example of gp120, yet enough
of the bridging sheet is removed to expose the CD4 groove, allowing an immune response to be
generated against epitopes in or near the CD4 binding site of the Env polypeptide (e.g., gp120).

In one aspect, the invention includes a polynucleotide encoding a modified HIV Env
polypeptide wherein the polypeptide has at least one amino acid residue modified (e.g.,
deleted or replaced) in the region corresponding to residues 421 to 436 relative to
HXB-2, for example the constructs depicted in Figures 6-29 (SEQ ID NOs:3 to 26). In
certain embodiments, the polynucleotide also has the region corresponding to residues 124-
198 of the polypeptide HXB-2 (e.g., V1/V2) deleted and at least one amino acid deleted or
replaced in the regions corresponding to the residues 119 to 123 and 199 to 210, relative to
HXB-2. In other embodiments, these polynucleotides encode Env polypeptides having at
least one amino acid of the small loop of the bridging sheet (e.g., amino acid residues 427 to
429 relative to HXB-2) deleted or replaced. The amino acid sequences of the modified
polypeptides encoded by the polynucleotides of the present invention can be based on any
HIV variant, for example SF162.

In another aspect, the invention includes immunogenic modified HIV Env
polypeptides having at least one amino acid residue modified (e.g., deleted or replaced)
in the region corresponding to residues 421 to 436 relative to HXB-2, for example a
deletion or replacement of one or more amino acids in the small loop region (e.g., amino acid
residues 427 to 429 relative to HXB-2). These polypeptides may have modifications (e.g., a
deletion or a replacement) of at least one amino acid between about amino acid residue 420 and
amino acid residue 436, relative to HXB-2 and, optionally, may have deletions or truncations of
the V1 and/or V2 regions. The immunogenic, modified polypeptides of the present invention can
be based on any HIV variant, for example SF162.

In another aspect, the invention includes a vaccine composition comprising any of the
polynucleotides encoding modified Env polypeptides described above. Vaccine
compositions comprising the modified Env polypeptides and, optionally, an adjuvant are also
included in the invention.

In yet another aspect, the invention includes a method of inducing an immune
response in a subject comprising administering one or more of the polynucleotides or
constructs described above in an amount sufficient to induce an immune response in the
subject. In certain embodiments, the method further comprises administering an adjuvant to
the subject.

In another aspect, the invention includes a method of inducing an immune response in
a subject comprising administering a composition comprising any of the modified Env
polypeptides described above and an adjuvant. The composition is administered in an
amount sufficient to induce an immune response in the subject.

In another aspect, the invention includes a method of inducing an immune response in
a subject comprising

(a) administering a first composition comprising any of the polynucleotides described
above in a priming step and

(b) administering a second composition comprising any of the modified Env
polypeptides described above, as a booster, in an amount sufficient to induce an immune
response in the subject. In certain embodiments, the first composition, the second
composition or both the first and second compositions further comprise an adjuvant.

These and other embodiments of the subject invention will readily occur to those of 
skill in the art in light of the disclosure herein. 


Brief Description of the Drawings 

Figure 1 is a schematic depiction of the tertiary structure of the HIV-1 HXB-2 Env gp120
polypeptide, as determined by crystallography studies.

Figures 2A-C depict alignment of the amino acid sequence of wild-type HIV-1 HXB-2
Env gp160 polypeptide (SEQ ID NO:1) with amino acid sequence of HIV variants SF162
(shown as "162") (SEQ ID NO:2), SF2, CM236 and US4. Arrows indicate the regions that
are deleted or replaced in the modified polypeptides. Black dots indicate conserved cysteine
residues. The star indicates the position of the last amino acid in gp120.

Figures 3A-J depict alignment of nucleotide sequences of polynucleotides encoding
modified Env polypeptides having V1/V2 deletions. The unmodified amino acid residues
encoded by these sequences correspond to wildtype SF162 residues but are numbered relative
to HXB-2.

Figures 4A-M depict alignment of nucleotide sequences of polynucleotides encoding
modified Env polypeptides having deletions or replacements in the small loop. The
unmodified amino acid residues encoded by these sequences correspond to wildtype SF162
residues but are numbered relative to HXB-2.

Figures 5A-N depict alignment of nucleotide sequences of polynucleotides encoding
modified Env polypeptides having both V1/V2 deletions and, in addition, deletions or
replacements in the small loop. The unmodified amino acid residues encoded by these
sequences correspond to wildtype SF162 residues but are numbered relative to HXB-2.

Figure 6 depicts the nucleotide sequence of the construct designated Val120-Ala204
(SEQ ID NO:3).

Figure 7 depicts the nucleotide sequence of the construct designated Val120-Ile201
(SEQ ID NO:4).

Figure 8 depicts the nucleotide sequence of the construct designated Val120-Ile201B
(SEQ ID NO:5).

Figure 9 depicts the nucleotide sequence of the construct designated Lys121-Val200
(SEQ ID NO:6).

Figure 10 depicts the nucleotide sequence of the construct designated Leu122-Ser199
(SEQ ID NO:7).

Figure 11 depicts the nucleotide sequence of the construct designated Val120-Thr202
(SEQ ID NO:8).

Figure 12 depicts the nucleotide sequence of the construct designated Trp427-Gly431
(SEQ ID NO:9).

Figure 13 depicts the nucleotide sequence of the construct designated Arg426-Gly431
(SEQ ID NO:10).

Figure 14 depicts the nucleotide sequence of the construct designated Arg426-Gly431B
(SEQ ID NO:11).

Figure 15 depicts the nucleotide sequence of the construct designated Arg426-Lys432
(SEQ ID NO:12).

Figure 16 depicts the nucleotide sequence of the construct designated Asn425-Lys432
(SEQ ID NO:13).

Figure 17 depicts the nucleotide sequence of the construct designated Ile424-Ala433
(SEQ ID NO:14).

Figure 18 depicts the nucleotide sequence of the construct designated Ile423-Met434
(SEQ ID NO:15).

Figure 19 depicts the nucleotide sequence of the construct designated Gln422-Tyr435
(SEQ ID NO:16).

Figure 20 depicts the nucleotide sequence of the construct designated Gln422-Tyr435B
(SEQ ID NO:17).

Figure 21 depicts the nucleotide sequence of the construct designated Leu122-Ser199;
Arg426-Gly431 (SEQ ID NO:18).

Figure 22 depicts the nucleotide sequence of the construct designated Leu122-Ser199;
Arg426-Lys432 (SEQ ID NO:19).

Figure 23 depicts the nucleotide sequence of the construct designated Leu122-Ser199;
Trp427-Gly431 (SEQ ID NO:20).

Figure 24 depicts the nucleotide sequence of the construct designated Lys121-Val200;
Asn425-Lys432 (SEQ ID NO:21).

Figure 25 depicts the nucleotide sequence of the construct designated Val120-Ile201;
Ile424-Ala433 (SEQ ID NO:22).

Figure 26 depicts the nucleotide sequence of the construct designated Val120-Ile201B;
Ile424-Ala433 (SEQ ID NO:23).

Figure 27 depicts the nucleotide sequence of the construct designated Val120-Thr202;
Ile424-Ala433 (SEQ ID NO:24).

Figure 28 depicts the nucleotide sequence of the construct designated Val127-Asn195
(SEQ ID NO:25).

Figure 29 depicts the nucleotide sequence of the construct designated Val127-Asn195;
Arg426-Gly431 (SEQ ID NO:26).


Detailed Description of the Invention 

The practice of the present invention will employ, unless otherwise indicated,
conventional methods of protein chemistry, viral immunobiology, molecular biology and
recombinant DNA techniques within the skill of the art. Such techniques are explained fully
in the literature. See, e.g., T.E. Creighton, Proteins: Structures and Molecular Properties
(W.H. Freeman and Company, 1993); Nelson L.M. and Jerome H.K. HIV Protocols in
Methods in Molecular Medicine, vol. 17, 1999; Sambrook, et al., Molecular Cloning: A
Laboratory Manual (Cold Spring Harbor Laboratory, 1989); F.M. Ausubel et al. Current
Protocols in Molecular Biology, Greene Publishing Associates & Wiley Interscience New



singular forms "a", "an" and "the" include plural referents unless the content clearly dictates 
otherwise. Thus, for example, reference to "a polypeptide" includes a mixture of two or more 
polypeptides, and the like. 



In describing the present invention, the following terms will be employed, and are
intended to be defined as indicated below.

The terms "polypeptide," and "protein" are used interchangeably herein to denote any
polymer of amino acid residues. The terms encompass peptides, oligopeptides, dimers,
multimers, and the like. Such polypeptides can be derived from natural sources or can be
synthesized or recombinantly produced. The terms also include postexpression modifications
of the polypeptide, for example, glycosylation, acetylation, phosphorylation, etc.

A polypeptide as defined herein is generally made up of the 20 natural amino acids
Ala (A), Arg (R), Asn (N), Asp (D), Cys (C), Gln (Q), Glu (E), Gly (G), His (H), Ile (I), Leu
(L), Lys (K), Met (M), Phe (F), Pro (P), Ser (S), Thr (T), Trp (W), Tyr (Y) and Val (V) and
may also include any of the several known amino acid analogs, both naturally occurring and
synthesized analogs, such as but not limited to homoisoleucine, asaleucine, 2-
(methylenecyclopropyl)glycine, S-methylcysteine, S-(prop-1-enyl)cysteine, homoserine,
ornithine, norleucine, norvaline, homoarginine, 3-(3-carboxyphenyl)alanine,
cyclohexylalanine, mimosine, pipecolic acid, 4-methylglutamic acid, canavanine, 2,3-
diaminopropionic acid, and the like. Further examples of polypeptide agents which will find
use in the present invention are set forth below.

By "geometry" or "tertiary structure" of a polypeptide or protein is meant the overall
3-D configuration of the protein. As described herein, the geometry can be determined, for
example, by crystallography studies or by using various programs or algorithms which
predict the geometry based on interactions between the amino acids making up the primary
and secondary structures.

By "wild type" polypeptide, polypeptide agent or polypeptide drug, is meant a
naturally occurring polypeptide sequence, and its corresponding secondary structure. An
"isolated" or "purified" protein or polypeptide is a protein which is separate and discrete from
a whole organism with which the protein is normally associated in nature. It is apparent that
the term denotes proteins of various levels of purity. Typically, a composition containing a
purified protein will be one in which at least about 35%, preferably at least about 40-50%,
more preferably, at least about 75-85%, and most preferably at least about 90% or more, of
the total protein in the composition will be the protein in question.

By "Env polypeptide" is meant a molecule derived from an envelope protein,
preferably from HIV Env. The envelope protein of HIV-1 is a glycoprotein of about 160 kd
(gp160). During virus infection of the host cell, gp160 is cleaved by host cell proteases to
form gp120 and the integral membrane protein, gp41. The gp41 portion is anchored in (and
spans) the membrane bilayer of the virion, while the gp120 segment protrudes into the
surrounding environment. As there is no covalent attachment between gp120 and gp41, free
gp120 is released from the surface of virions and infected cells. Env polypeptides may also
include gp140 polypeptides. Env polypeptides can exist as monomers, dimers or multimers.

By a "gp120 polypeptide" is meant a molecule derived from a gp120 region of the
Env polypeptide. Preferably, the gp120 polypeptide is derived from HIV Env. The primary
amino acid sequence of gp120 is approximately 511 amino acids, with a polypeptide core of
about 60,000 daltons. The polypeptide is extensively modified by N-linked glycosylation to
increase the apparent molecular weight of the molecule to 120,000 daltons. The amino acid
sequence of gp120 contains five relatively conserved domains interspersed with five
hypervariable domains. The positions of the 18 cysteine residues in the gp120 primary
sequence of the HIV-1 HXB-2 (hereinafter "HXB-2") strain, and the positions of 13 of the
approximately 24 N-linked glycosylation sites in the gp120 sequence are common to most, if
not all, gp120 sequences. The hypervariable domains contain extensive amino acid
substitutions, insertions and deletions. Despite this variation, most, if not all, gp120
sequences preserve the virus's ability to bind to the viral receptor CD4. A "gp120
polypeptide" includes both single subunits and multimers.

Env polypeptides (e.g., gp120, gp140 and gp160) include a "bridging sheet"
comprised of 4 anti-parallel β-strands (β-2, β-3, β-20 and β-21) that form a β-sheet.
Extruding from one pair of the β-strands (β-2 and β-3) are two loops, V1 and V2. The β-2
strand occurs at approximately amino acid residue 119 (Cys) to amino acid residue 123 (Thr)
while β-3 occurs at approximately amino acid residue 199 (Ser) to amino acid residue 201
(Ile), relative to HXB-2. The "V1/V2 region" occurs at approximately amino acid positions
126 (Cys) to residue 196 (Cys), relative to HXB-2 (see, e.g., Wyatt et al. (1995) J. Virol.
69:5723-5733; Stamatatos et al. (1998) J. Virol. 72:7840-7845). Extruding from the second
pair of β-strands (β-20 and β-21) is a "small-loop" structure, also referred to herein as "the
bridging sheet small loop." In HXB-2, β-20 extends from about amino acid residue 422
(Gln) to amino acid residue 426 (Met) while β-21 extends from about amino acid residue 430
(Val) to amino acid residue 435 (Tyr). In variant SF162, the Met-426 is an Arg (R) residue.
The "small loop" extends from about amino acid residue 427 (Trp) through 429 (Lys),
relative to HXB-2. A representative diagram of gp120 showing the bridging sheet, the small
loop, and V1/V2 is shown in Figure 1. In addition, alignment of the amino acid sequences of
Env polypeptide gp160 of selected variants is shown, relative to HXB-2, in Figures 2A-C.
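
The approximate HXB-2 residue ranges recited above can be collected into a small lookup table. The following Python sketch is illustrative only: the dictionary name, the region labels and the function name are assumptions introduced here, and the ranges are the approximate values stated in the text, not exact structural boundaries.

```python
from typing import List

# Approximate residue ranges (inclusive), numbered relative to HXB-2,
# as recited in the description. Labels are illustrative.
BRIDGING_SHEET_REGIONS = {
    "beta-2": (119, 123),
    "V1/V2": (126, 196),
    "beta-3": (199, 201),
    "beta-20": (422, 426),
    "small loop": (427, 429),
    "beta-21": (430, 435),
}


def regions_containing(residue: int) -> List[str]:
    """Return the labels of all listed regions that contain the given residue."""
    return [
        name
        for name, (start, end) in BRIDGING_SHEET_REGIONS.items()
        if start <= residue <= end
    ]


if __name__ == "__main__":
    for position in (121, 428, 300):
        print(position, regions_containing(position))
```

A residue outside every listed range, such as 300, simply yields an empty list.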

Furthermore, an "Env polypeptide" or "gp120 polypeptide" as defined herein is not
limited to a polypeptide having the exact sequence described herein. Indeed, the HIV
genome is in a state of constant flux and contains several variable domains which exhibit
relatively high degrees of variability between isolates. It is readily apparent that the terms
encompass Env (e.g., gp120) polypeptides from any of the identified HIV isolates, as well as
newly identified isolates, and subtypes of these isolates. Descriptions of structural features
are given herein with reference to HXB-2. One of ordinary skill in the art in view of the
teachings of the present disclosure and the art can determine corresponding regions in other
HIV variants (e.g., isolates HIV IIIb, HIV SF2, HIV-1 SF162, HIV-1 SF170, HIV LAV, HIV LAI, HIV MN,
HIV-1 CM235, HIV-1 US4, other HIV-1 strains from diverse subtypes (e.g., subtypes A through
G, and O), HIV-2 strains and diverse subtypes (e.g., HIV-2 UC1 and HIV-2 UC2), and simian
immunodeficiency virus (SIV)) (see, e.g., Virology, 3rd Edition (W.K. Joklik ed. 1988);
Fundamental Virology, 2nd Edition (B.N. Fields and D.M. Knipe, eds. 1991); Virology, 3rd
Edition (Fields, BN, DM Knipe, PM Howley, Editors, 1996, Lippincott-Raven, Philadelphia,
PA), for a description of these and other related viruses), using, for example, sequence
comparison programs (e.g., BLAST and others described herein) or identification and
alignment of structural features (e.g., a program such as the "ALB" program described herein
that can identify β-sheet regions). The actual amino acid sequences of the modified Env
polypeptides can be based on any HIV variant.
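
As a minimal sketch of how residue numbering relative to a reference sequence can be carried over to a variant, the Python example below walks a pre-computed, gapped pairwise alignment and records, for each reference position, the corresponding variant position. The function name and the toy sequences are hypothetical and are not HIV sequences; producing the alignment itself (e.g., with a sequence comparison program) is assumed to have been done beforehand.

```python
from typing import Dict, Optional


def map_reference_numbering(
    ref_aligned: str, var_aligned: str, gap: str = "-"
) -> Dict[int, Optional[int]]:
    """Map 1-based reference positions to 1-based variant positions.

    Both inputs are rows of the same pairwise alignment (equal length).
    A reference residue aligned to a gap in the variant maps to None.
    """
    if len(ref_aligned) != len(var_aligned):
        raise ValueError("aligned sequences must have equal length")

    mapping: Dict[int, Optional[int]] = {}
    ref_pos = 0
    var_pos = 0
    for ref_char, var_char in zip(ref_aligned, var_aligned):
        if var_char != gap:
            var_pos += 1
        if ref_char != gap:
            ref_pos += 1
            mapping[ref_pos] = var_pos if var_char != gap else None
    return mapping


if __name__ == "__main__":
    # Toy alignment: the variant lacks reference residue 3 and carries
    # a two-residue insertion before reference residue 4.
    reference = "ACD--EFG"
    variant = "AC-WWEFG"
    print(map_reference_numbering(reference, variant))
```

With such a mapping, a region stated relative to the reference (for example, a range of reference positions) can be translated into variant coordinates by looking up its endpoints.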

Additionally, the term "Env polypeptide" (e.g., "gp120 polypeptide") encompasses
proteins which include additional modifications to the native sequence, such as additional
internal deletions, additions and substitutions. These modifications may be deliberate, as
through site-directed mutagenesis, or may be accidental, such as through naturally occurring
mutational events. Thus, for example, if the Env polypeptide is to be used in vaccine
compositions, the modifications must be such that immunological activity (i.e., the ability to
elicit an antibody response to the polypeptide) is not lost. Similarly, if the polypeptides are to
be used for diagnostic purposes, such capability must be retained.

Thus, a "modified Env polypeptide" is an Env polypeptide (e.g., gp120 as defined
above), which has been manipulated to delete or replace all or a part of the bridging sheet
portion and, optionally, the variable regions V1 and V2. Generally, modified Env (e.g.,
gp120) polypeptides have enough of the bridging sheet removed to expose the CD4 binding
site, but leave enough of the structure to allow correct folding (e.g., correct geometry). Thus,
modifications to the β-20 and β-21 regions (between about amino acid residues 420 and 435
relative to HXB-2) are preferred. Additionally, modifications to the β-2 and β-3 regions
(between about amino acid residues 119 (Cys) and 201 (Ile)) and modifications (e.g.,
truncations) to the V1 and V2 loop regions may also be made. Although not all possible
β-sheet and V1/V2 modifications have been exemplified herein, it is to be understood that other
disrupting modifications are also encompassed by the present invention.

Normally, such a modified polypeptide is capable of secretion into growth medium in
which an organism expressing the protein is cultured. However, for purposes of the present
invention, such polypeptides may also be recovered intracellularly. Secretion into growth
media is readily determined using a number of detection techniques, including, e.g.,
polyacrylamide gel electrophoresis and the like, and immunological techniques such as
Western blotting and immunoprecipitation assays as described in, e.g., International
Publication No. WO 96/04301, published February 15, 1996.

A gp120 or other Env polypeptide is produced "intracellularly" when it is found
within the cell, either associated with components of the cell, such as in association with the
endoplasmic reticulum (ER) or the Golgi Apparatus, or when it is present in the soluble
cellular fraction. The gp120 and other Env polypeptides of the present invention may also be
secreted into growth medium so long as sufficient amounts of the polypeptides remain
present within the cell such that they can be purified from cell lysates using techniques
described herein.

An "immunogenic" gp120 or other Env protein is a molecule that includes at least one
epitope such that the molecule is capable of either eliciting an immunological reaction in an
individual to which the protein is administered or, in the diagnostic context, is capable of
reacting with antibodies directed against the HIV in question.

By "epitope" is meant a site on an antigen to which specific B cells and/or T cells
respond, rendering the molecule including such an epitope capable of eliciting an
immunological reaction or capable of reacting with HIV antibodies present in a biological
sample. The term is also used interchangeably with "antigenic determinant" or "antigenic
determinant site." An epitope can comprise 3 or more amino acids in a spatial conformation
unique to the epitope. Generally, an epitope consists of at least 5 such amino acids and, more
usually, consists of at least 8-10 such amino acids. Methods of determining spatial
conformation of amino acids are known in the art and include, for example, x-ray
crystallography and 2-dimensional nuclear magnetic resonance. Furthermore, the
identification of epitopes in a given protein is readily accomplished using techniques well
known in the art, such as by the use of hydrophobicity studies and by site-directed serology.
See, also, Geysen et al., Proc. Natl. Acad. Sci. USA (1984) 81:3998-4002 (general method of
rapidly synthesizing peptides to determine the location of immunogenic epitopes in a given
antigen); U.S. Patent No. 4,708,871 (procedures for identifying and chemically synthesizing
epitopes of antigens); and Geysen et al., Molecular Immunology (1986) 23:709-715
(technique for identifying peptides with high affinity for a given antibody). Antibodies that
recognize the same epitope can be identified in a simple immunoassay showing the ability of
one antibody to block the binding of another antibody to a target antigen.

An "immunological response" or "immune response" as used herein is the
development in the subject of a humoral and/or a cellular immune response to the Env (e.g.,
gp120) polypeptide when the polypeptide is present in a vaccine composition. Antibodies
elicited by such a response may also neutralize infectivity, and/or mediate antibody-complement
or antibody-dependent cell cytotoxicity to provide protection to an immunized host.
Immunological reactivity may be determined in standard immunoassays, such as competition
assays, well known in the art.

Techniques for determining amino acid sequence "similarity" are well known in the
art. In general, "similarity" means the exact amino acid to amino acid comparison of two or
more polypeptides at the appropriate place, where amino acids are identical or possess similar
chemical and/or physical properties such as charge or hydrophobicity. A so-termed "percent
similarity" then can be determined between the compared polypeptide sequences.

Techniques for determining nucleic acid and amino acid sequence identity also are well
known in the art and include determining the nucleotide sequence of the mRNA for that gene
(usually via a cDNA intermediate) and determining the amino acid sequence encoded
thereby, and comparing this to a second amino acid sequence. In general, "identity" refers to
an exact nucleotide to nucleotide or amino acid to amino acid correspondence of two
polynucleotides or polypeptide sequences, respectively.

Two or more polynucleotide sequences can be compared by determining their
"percent identity." Two or more amino acid sequences likewise can be compared by
determining their "percent identity." The percent identity of two sequences, whether nucleic
acid or peptide sequences, is generally described as the number of exact matches between two
aligned sequences divided by the length of the shorter sequence and multiplied by 100. An
approximate alignment for nucleic acid sequences is provided by the local homology
algorithm of Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981).
This algorithm can be extended to use with peptide sequences using the scoring matrix
developed by Dayhoff, Atlas of Protein Sequences and Structure, M.O. Dayhoff ed., 5 suppl.
3:353-358, National Biomedical Research Foundation, Washington, D.C., USA, and
normalized by Gribskov, Nucl. Acids Res. 14(6):6745-6763 (1986). An implementation of
this algorithm for nucleic acid and peptide sequences is provided by the Genetics Computer
Group (Madison, WI) in their BestFit utility application. The default parameters for this
method are described in the Wisconsin Sequence Analysis Package Program Manual, Version
8 (1995) (available from Genetics Computer Group, Madison, WI). Other equally suitable
programs for calculating the percent identity or similarity between sequences are generally
known in the art.
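
By way of illustration only, the percent identity calculation described above (exact matches between two aligned sequences, divided by the length of the shorter sequence, multiplied by 100) can be sketched in Python. The function name, the use of "-" as the gap character, and the choice to measure the shorter sequence without its gap characters are assumptions made for this sketch; the sketch is not the BestFit implementation.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Counts positions where both sequences carry the same residue, divides by
    the length of the shorter sequence (gap characters excluded, an assumption
    of this sketch), and multiplies by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # A match requires identical residues; gap-to-gap columns do not count.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    shorter = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


# Four matching columns; the shorter sequence has five residues.
print(percent_identity("ACGT-A", "ACCTGA"))
```

The same function applies to peptide sequences, since it compares characters without reference to the alphabet used.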

For example, percent identity of a particular nucleotide sequence to a reference
sequence can be determined using the homology algorithm of Smith and Waterman with a
default scoring table and a gap penalty of six nucleotide positions. Another method of
establishing percent identity in the context of the present invention is to use the MPSRCH
package of programs copyrighted by the University of Edinburgh, developed by John F.
Collins and Shane S. Sturrok, and distributed by IntelliGenetics, Inc. (Mountain View, CA).
From this suite of packages, the Smith-Waterman algorithm can be employed where default
parameters are used for the scoring table (for example, gap open penalty of 12, gap extension
penalty of one, and a gap of six). From the data generated, the "Match" value reflects
"sequence identity." Other suitable programs for calculating the percent identity or similarity
between sequences are generally known in the art, such as the alignment program BLAST,
which can also be used with default parameters. For example, BLASTN and BLASTP can be
used with the following default parameters: genetic code = standard; filter = none; strand =
both; cutoff = 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50 sequences; sort by =
HIGH SCORE; Databases = non-redundant, GenBank + EMBL + DDBJ + PDB + GenBank
CDS translations + Swiss protein + Spupdate + PIR. Details of these programs can be found
at the following internet address: http://www.ncbi.nlm.gov/cgi-bin/BLAST.
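
For orientation, a minimal sketch of a Smith-Waterman style local alignment is given below. It is a simplification under stated assumptions: the match, mismatch and gap values are illustrative only and are not the default parameters of MPSRCH, BestFit or BLAST, and a single linear gap penalty is used in place of the separate gap open and gap extension penalties mentioned above. The function name is hypothetical.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   gap: int = -2) -> tuple[int, str, str]:
    """Best local alignment of a and b as (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions; gaps are penalized linearly.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; local alignment never lets a cell drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTACGTAA", "GGACGTCC"))
```

The two aligned strings returned by such a routine can then be compared position by position to obtain a percent identity.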

One of skill in the art can readily determine the proper search parameters to use for a
given sequence in the above programs. For example, the search parameters may vary based
on the size of the sequence in question. Thus, for example, a representative embodiment of
the present invention would include an isolated polynucleotide having X contiguous
nucleotides, wherein (i) the X contiguous nucleotides have at least about 50% identity to Y
contiguous nucleotides derived from any of the sequences described herein, (ii) X equals Y,
and (iii) X is greater than or equal to 6 nucleotides and up to 5000 nucleotides, preferably
greater than or equal to 8 nucleotides and up to 5000 nucleotides, more preferably 10-12
nucleotides and up to 5000 nucleotides, and even more preferably 15-20 nucleotides, up to
the number of nucleotides present in the full-length sequences described herein (e.g., see the
Sequence Listing and claims), including all integer values falling within the above-described
ranges.
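
As a hedged illustration of the X and Y criterion above, the following sketch scans every window of X contiguous nucleotides in a query against every window of equal length (X equals Y) in a reference and reports the pairs meeting an identity threshold. The ungapped, position-by-position comparison and the function name are assumptions of this sketch; an alignment program such as those named above would ordinarily be used instead.

```python
def windows_meeting_identity(query: str, reference: str, x: int,
                             min_identity: float = 50.0) -> list[tuple[int, int, float]]:
    """Return (query_start, reference_start, percent_identity) for every pair
    of x-nucleotide windows whose ungapped identity is at least min_identity."""
    if x < 1:
        raise ValueError("window length x must be positive")
    hits = []
    for i in range(len(query) - x + 1):
        q = query[i:i + x]
        for j in range(len(reference) - x + 1):
            r = reference[j:j + x]
            # Ungapped comparison: count identical bases at the same offset.
            matches = sum(a == b for a, b in zip(q, r))
            identity = 100.0 * matches / x
            if identity >= min_identity:
                hits.append((i, j, identity))
    return hits


hits = windows_meeting_identity("ACGTAC", "ACGTTT", x=6)
print(hits)
```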

The synthetic expression cassettes (and purified polynucleotides) of the present
invention include related polynucleotide sequences having about 80% to 100%, greater than
80-85%, preferably greater than 90-92%, more preferably greater than 95%, and most
preferably greater than 98% sequence identity (including all integer values falling within these
described ranges) to the synthetic expression cassette sequences disclosed herein (for
example, to the claimed sequences or other sequences of the present invention) when the
sequences of the present invention are used as the query sequence.

Computer programs are also available to determine the likelihood of certain
polypeptides to form structures such as β-sheets. One such program, described herein, is the
"ALB" program for protein and polypeptide secondary structure calculation and prediction.
In addition, secondary protein structure can be predicted from the primary amino acid
sequence, for example using protein crystal structure and aligning the protein sequence
related to the crystal structure (e.g., using Molecular Operating Environment (MOE)
programs available from the Chemical Computing Group Inc., Montreal, P.Q., Canada).
Other methods of predicting secondary structures are described, for example, in Garnier et al.
(1996) Methods Enzymol. 266:540-553; Geourjon et al. (1995) Comput. Applic. Biosci.
11:681-684; Levin (1997) Protein Eng. 10:771-776; and Rost et al. (1993) J. Molec. Biol.
232:584-599.

Homology can also be determined by hybridization of polynucleotides under
conditions which form stable duplexes between homologous regions, followed by digestion
with single-stranded-specific nuclease(s), and size determination of the digested fragments.
Two DNA, or two polypeptide, sequences are "substantially homologous" to each other when
the sequences exhibit at least about 80%-85%, preferably at least about 90%, and most
preferably at least about 95%-98% sequence identity over a defined length of the molecules,
as determined using the methods above. As used herein, substantially homologous also refers
to sequences showing complete identity to the specified DNA or polypeptide sequence. DNA
sequences that are substantially homologous can be identified in a Southern hybridization
experiment under, for example, stringent conditions, as defined for that particular system.
Defining appropriate hybridization conditions is within the skill of the art. See, e.g.,
Sambrook et al., supra; DNA Cloning, supra; Nucleic Acid Hybridization, supra.

A "coding sequence" or a sequence which "encodes" a selected protein, is a nucleic
acid sequence which is transcribed (in the case of DNA) and translated (in the case of
mRNA) into a polypeptide in vitro or in vivo when placed under the control of appropriate
regulatory sequences. The boundaries of the coding sequence are determined by a start codon
at the 5' (amino) terminus and a translation stop codon at the 3' (carboxy) terminus. A coding
sequence can include, but is not limited to, cDNA from viral nucleotide sequences as well as
synthetic and semisynthetic DNA sequences and sequences including base analogs. A
transcription termination sequence may be located 3' to the coding sequence.
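
The boundaries described above (a start codon at the 5' end and an in-frame translation stop codon at the 3' end) can be illustrated with a short sketch that locates the first such region in a DNA string. The function name is hypothetical, only the sense strand is searched, and the standard start codon ATG and stop codons TAA, TAG and TGA are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def find_coding_sequence(dna: str):
    """Return the first ATG...stop region read in frame, or None if absent.

    Only the given (sense) strand is searched; this is a sketch, not a full
    open reading frame finder.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon from the start codon, staying in frame.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return dna[start:pos + 3]
        # No in-frame stop after this ATG; try the next ATG downstream.
        start = dna.find("ATG", start + 1)
    return None


print(find_coding_sequence("ccATGGCTTGAtt"))
```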
"Control elements" refers collectively to promoter sequences, ribosome binding sites,
polyadenylation signals, transcription termination sequences, upstream regulatory domains,
enhancers, and the like, which collectively provide for the transcription and translation of a
coding sequence in a host cell. Not all of these control elements need always be present so
long as the desired gene is capable of being transcribed and translated.

A control element "directs the transcription" of a coding sequence in a cell when RNA
polymerase will bind the promoter sequence and transcribe the coding sequence into mRNA,
which is then translated into the polypeptide encoded by the coding sequence.

"Operably linked" refers to an arrangement of elements wherein the components so
described are configured so as to perform their usual function. Thus, control elements
operably linked to a coding sequence are capable of effecting the expression of the coding
sequence when RNA polymerase is present. The control elements need not be contiguous
with the coding sequence, so long as they function to direct the expression thereof. Thus, for
example, intervening untranslated yet transcribed sequences can be present between, e.g., a
promoter sequence and the coding sequence and the promoter sequence can still be
considered "operably linked" to the coding sequence.

"Recombinant" as used herein to describe a nucleic acid molecule means a
polynucleotide of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue of its
origin or manipulation: (1) is not associated with all or a portion of the polynucleotide with
which it is associated in nature; and/or (2) is linked to a polynucleotide other than that to
which it is linked in nature. The term "recombinant" as used with respect to a protein or
polypeptide means a polypeptide produced by expression of a recombinant polynucleotide.
"Recombinant host cells," "host cells," "cells," "cell lines," "cell cultures," and other such
terms denoting procaryotic microorganisms or eucaryotic cell lines cultured as unicellular
entities, are used interchangeably, and refer to cells which can be, or have been, used as
recipients for recombinant vectors or other transfer DNA, and include the progeny of the
original cell which has been transfected. It is understood that the progeny of a single parental
cell may not necessarily be completely identical in morphology or in genomic or total DNA
complement to the original parent, due to accidental or deliberate mutation. Progeny of the
parental cell which are sufficiently similar to the parent to be characterized by the relevant
property, such as the presence of a nucleotide sequence encoding a desired peptide, are
included in the progeny intended by this definition, and are covered by the above terms.

By "vertebrate subject" is meant any member of the subphylum chordata, including,
without limitation, humans and other primates, including non-human primates such as
chimpanzees and other apes and monkey species; farm animals such as cattle, sheep, pigs,
goats and horses; domestic mammals such as dogs and cats; laboratory animals including
rodents such as mice, rats and guinea pigs; birds, including domestic, wild and game birds
such as chickens, turkeys and other gallinaceous birds, ducks, geese, and the like. The term
does not denote a particular age. Thus, both adult and newborn individuals are intended to be
covered.

As used herein, a "biological sample" refers to a sample of tissue or fluid isolated
from an individual, including but not limited to, for example, blood, plasma, serum, fecal
matter, urine, bone marrow, bile, spinal fluid, lymph fluid, samples of the skin, external
secretions of the skin, respiratory, intestinal, and genitourinary tracts, samples derived from
the gastric epithelium and gastric mucosa, tears, saliva, milk, blood cells, organs, biopsies
and also samples of in vitro cell culture constituents including but not limited to conditioned
media resulting from the growth of cells and tissues in culture medium, e.g., recombinant
cells, and cell components.

The terms "label" and "detectable label" refer to a molecule capable of detection,
including, but not limited to, radioactive isotopes, fluorescers, chemiluminescers, enzymes,
enzyme substrates, enzyme cofactors, enzyme inhibitors, chromophores, dyes, metal ions,
metal sols, ligands (e.g., biotin or haptens) and the like. The term "fluorescer" refers to a
substance or a portion thereof which is capable of exhibiting fluorescence in the detectable
range. Particular examples of labels which may be used with the invention include, but are
not limited to, fluorescein, rhodamine, dansyl, umbelliferone, Texas red, luminol, acridinium
esters, NADPH, α-β-galactosidase, horseradish peroxidase, glucose oxidase, alkaline
phosphatase and urease.

Overview 

The present invention concerns modified Env polypeptide molecules (e.g.,
glycoprotein ("gp") 120). Without being bound by a particular theory, it appears that it has
been difficult to generate immunological responses against Env because the CD4 binding site
is buried between the outer domain, the inner domain and the V1/V2 domains. Thus,
although deletion of the V1/V2 domain may render the virus more susceptible to
neutralization by monoclonal antibody directed to the CD4 site, the bridging sheet covering
most of the CD4 binding domain may prevent an antibody response. Thus, the present
invention provides Env polypeptides that maintain their general overall structure yet expose
the CD4 binding domain. This allows the generation of an immune response (e.g., an
antibody response) to epitopes in or near the CD4 binding site.

Various forms of the different embodiments of the invention, described herein, may 
be combined. 



β-Sheet Conformations

In the present invention, the locations of the β-sheet structures were identified relative to
the 3-D (crystal) structure of an HXB-2 crystallized Env protein (see Example 1.A). Based on
this structure, constructs encoding polypeptides having replacements and/or excisions which
maintain overall geometry while exposing the CD4 binding site were designed. In particular,
the crystal structure of HXB-2 was downloaded from the Brookhaven Database. Using the
default parameters of the Loop Search feature of the Biopolymer module of the Sybyl
molecular modeling package, homology and fit of amino acids which could replace the native
loops between β-strands yet maintain overall tertiary structure were determined. Constructs
encoding the modified Env polypeptides were then designed (Example 1.B).

Thus, the modified Env polypeptides typically have enough of the bridging sheet
removed to expose the CD4 groove, but have enough of the structure to allow correct folding
of the Env glycoprotein. Exemplary constructs are described below.

Polypeptide Production 

The polypeptides of the present invention can be produced in any number of ways
which are well known in the art.

In one embodiment, the polypeptides are generated using recombinant techniques,
well known in the art. In this regard, oligonucleotide probes can be devised based on the
known sequences of the Env (e.g., gp120) polypeptide genome and used to probe genomic or
cDNA libraries for Env genes. The gene can then be further isolated using standard
techniques and, e.g., restriction enzymes employed to truncate the gene at desired portions of
the full-length sequence. Similarly, the Env gene(s) can be isolated directly from cells and
tissues containing the same, using known techniques, such as phenol extraction, and the
sequence further manipulated to produce the desired truncations. See, e.g., Sambrook et al.,
supra, for a description of techniques used to obtain and isolate DNA.

WO 00/39303 PCT/US99/31272

The genes encoding the modified (e.g., truncated and/or substituted) polypeptides can
be produced synthetically, based on the known sequences. The nucleotide sequence can be
designed with the appropriate codons for the particular amino acid sequence desired. The
complete sequence is generally assembled from overlapping oligonucleotides prepared by
standard methods and assembled into a complete coding sequence. See, e.g., Edge (1981)
Nature 292:756; Nambiar et al. (1984) Science 223:1299; Jay et al. (1984) J. Biol. Chem.
259:6311; Stemmer et al. (1995) Gene 164:49-53.
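
A minimal sketch of the two steps just described (choosing codons for a desired amino acid sequence, then dividing the designed sequence into overlapping oligonucleotides) is shown below. The codon table is an illustrative assumption with one valid codon per amino acid and is not a codon preference taught herein; the function names, oligonucleotide length and overlap are likewise hypothetical. For simplicity all oligonucleotides are taken from one strand, whereas assembly protocols commonly alternate strands.

```python
# One valid standard-code codon per amino acid (illustrative choice only).
ILLUSTRATIVE_CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "CGC",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
    "*": "TGA",
}


def back_translate(protein: str) -> str:
    """Design a nucleotide sequence encoding the given amino acid sequence."""
    return "".join(ILLUSTRATIVE_CODON[aa] for aa in protein.upper())


def overlapping_oligos(seq: str, length: int = 40, overlap: int = 20) -> list[str]:
    """Split seq into same-strand oligos; neighbours share `overlap` bases."""
    if not 0 < overlap < length:
        raise ValueError("overlap must be positive and smaller than length")
    step = length - overlap
    oligos, start = [], 0
    while True:
        oligos.append(seq[start:start + length])
        if start + length >= len(seq):
            break  # the last oligo reaches the end of the sequence
        start += step
    return oligos


gene = back_translate("MKV*")
print(gene)
print(overlapping_oligos(gene, length=8, overlap=4))
```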

Recombinant techniques are readily used to clone a gene encoding an Env
polypeptide, which can then be mutagenized in vitro by the replacement of the
appropriate base pair(s) to result in the codon for the desired amino acid. Such a change can
include as little as one base pair, effecting a change in a single amino acid, or can encompass
several base pair changes. Alternatively, the mutations can be effected using a mismatched
primer which hybridizes to the parent nucleotide sequence (generally cDNA corresponding to
the RNA sequence), at a temperature below the melting temperature of the mismatched
duplex. The primer can be made specific by keeping primer length and base composition
within relatively narrow limits and by keeping the mutant base centrally located. See, e.g.,
Innis et al., (1990) PCR Applications: Protocols for Functional Genomics; Zoller and Smith,
Methods Enzymol. (1983) 100:468. Primer extension is effected using DNA polymerase, the
product cloned and clones containing the mutated DNA, derived by segregation of the primer
extended strand, selected. Selection can be accomplished using the mutant primer as a
hybridization probe. The technique is also applicable for generating multiple point
mutations. See, e.g., Dalbie-McFarland et al. Proc. Natl. Acad. Sci. USA (1982) 79:6409.
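
The design of a mismatched primer with a centrally located mutant base, as described above, can be sketched as follows. The function name and the flank length are assumptions for illustration; the sketch returns a primer with the same sequence as the supplied strand except at the mutated position, and does not evaluate melting temperature or base composition.

```python
def mutagenic_primer(template: str, position: int, new_base: str,
                     flank: int = 10) -> str:
    """Return a primer carrying new_base at `position` (0-based), flanked on
    each side by `flank` template bases so the mismatch sits in the centre."""
    template = template.upper()
    new_base = new_base.upper()
    if new_base not in {"A", "C", "G", "T"}:
        raise ValueError("new_base must be one of A, C, G or T")
    if position - flank < 0 or position + flank >= len(template):
        raise ValueError("not enough template sequence around the position")
    if template[position] == new_base:
        raise ValueError("new_base is identical to the template base")
    upstream = template[position - flank:position]
    downstream = template[position + 1:position + flank + 1]
    return upstream + new_base + downstream


primer = mutagenic_primer("AAAACCCCGGGGTTTT", position=8, new_base="T", flank=3)
print(primer)
```

Because equal flanks are taken on both sides, the primer length is always 2 x flank + 1 and the mutant base is the middle residue.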

Once coding sequences for the desired proteins have been isolated or synthesized,
they can be cloned into any suitable vector or replicon for expression. As will be apparent
from the teachings herein, a wide variety of vectors encoding modified polypeptides can be
generated by creating expression constructs which operably link, in various combinations,
polynucleotides encoding Env polypeptides having deletions or mutations therein. Thus,
polynucleotides encoding a particular deleted V1/V2 region can be operably linked with
polynucleotides encoding polypeptides having deletions or replacements in the small loop
region and the construct introduced into a host cell for polypeptide expression. Non-limiting
examples of such combinations are discussed in the Examples.

Numerous cloning vectors are known to those of skill in the art, and the selection of
an appropriate cloning vector is a matter of choice. Examples of recombinant DNA vectors
for cloning and host cells which they can transform include the bacteriophage λ (E. coli),
pBR322 (E. coli), pACYC177 (E. coli), pKT230 (gram-negative bacteria), pGV1106
(gram-negative bacteria), pLAFR1 (gram-negative bacteria), pME290 (non-E. coli
gram-negative bacteria), pHV14 (E. coli and Bacillus subtilis), pBD9 (Bacillus), pIJ61
(Streptomyces), pUC6 (Streptomyces), YIp5 (Saccharomyces), YCp19 (Saccharomyces) and
bovine papilloma virus (mammalian cells). See, generally, DNA Cloning: Vols. I & II, supra;
Sambrook et al., supra; B. Perbal, supra.

Insect cell expression systems, such as baculovirus systems, can also be used and are
known to those of skill in the art and described in, e.g., Summers and Smith, Texas
Agricultural Experiment Station Bulletin No. 1555 (1987). Materials and methods for
baculovirus/insect cell expression systems are commercially available in kit form from, inter
alia, Invitrogen, San Diego CA ("MaxBac" kit).

Plant expression systems can also be used to produce the modified Env proteins.
Generally, such systems use virus-based vectors to transfect plant cells with heterologous
genes. For a description of such systems see, e.g., Porta et al., Mol. Biotech. (1996)
5:209-221; and Hackland et al., Arch. Virol. (1994) 139:1-22.

Viral systems, such as a vaccinia based infection/transfection system, as described in
Tomei et al., J. Virol. (1993) 67:4017-4026 and Selby et al., J. Gen. Virol. (1993)
74:1103-1113, will also find use with the present invention. In this system, cells are first
infected in vitro with a vaccinia virus recombinant that encodes the bacteriophage T7 RNA
polymerase. This polymerase displays exquisite specificity in that it only transcribes
templates bearing T7 promoters. Following infection, cells are transfected with the DNA of
interest, driven by a T7 promoter. The polymerase expressed in the cytoplasm from the
vaccinia virus recombinant transcribes the transfected DNA into RNA which is then
translated into protein by the host translational machinery. The method provides for high
level, transient, cytoplasmic production of large quantities of RNA and its translation
product(s).

The gene can be placed under the control of a promoter, ribosome binding site (for
bacterial expression) and, optionally, an operator (collectively referred to herein as "control"
elements), so that the DNA sequence encoding the desired Env polypeptide is transcribed into
RNA in the host cell transformed by a vector containing this expression construction. The
coding sequence may or may not contain a signal peptide or leader sequence. With the
present invention, both the naturally occurring signal peptides and heterologous sequences can
be used. Leader sequences can be removed by the host in post-translational processing. See,
e.g., U.S. Patent Nos. 4,431,739; 4,425,437; 4,338,397. Such sequences include, but are not
limited to, the TPA leader, as well as the honey bee melittin signal sequence.

Other regulatory sequences may also be desirable which allow for regulation of
expression of the protein sequences relative to the growth of the host cell. Such regulatory
sequences are known to those of skill in the art, and examples include those which cause the
expression of a gene to be turned on or off in response to a chemical or physical stimulus,
including the presence of a regulatory compound. Other types of regulatory elements may
also be present in the vector, for example, enhancer sequences.

The control sequences and other regulatory sequences may be ligated to the coding 
sequence prior to insertion into a vector. Alternatively, the coding sequence can be cloned 
directly into an expression vector which already contains the control sequences and an 
appropriate restriction site. 

In some cases it may be necessary to modify the coding sequence so that it may be
attached to the control sequences with the appropriate orientation; i.e., to maintain the proper
reading frame. Mutants or analogs may be prepared by the deletion of a portion of the
sequence encoding the protein, by insertion of a sequence, and/or by substitution of one or
more nucleotides within the sequence. Techniques for modifying nucleotide sequences, such
as site-directed mutagenesis, are well known to those skilled in the art. See, e.g., Sambrook
et al., supra; DNA Cloning, Vols. I and II, supra; Nucleic Acid Hybridization, supra.

The expression vector is then used to transform an appropriate host cell. A number of
mammalian cell lines are known in the art and include immortalized cell lines available from
the American Type Culture Collection (ATCC), such as, but not limited to, Chinese hamster
ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) cells, monkey kidney cells
(COS), human hepatocellular carcinoma cells (e.g., Hep G2), Vero293 cells, as well as others.
Similarly, bacterial hosts such as E. coli, Bacillus subtilis, and Streptococcus spp., will find
use with the present expression constructs. Yeast hosts useful in the present invention
include, inter alia, Saccharomyces cerevisiae, Candida albicans, Candida maltosa,
Hansenula polymorpha, Kluyveromyces fragilis, Kluyveromyces lactis, Pichia guilliermondii,
Pichia pastoris, Schizosaccharomyces pombe and Yarrowia lipolytica. Insect cells for use
with baculovirus expression vectors include, inter alia, Aedes aegypti, Autographa
californica, Bombyx mori, Drosophila melanogaster, Spodoptera frugiperda, and
Trichoplusia ni.

Depending on the expression system and host selected, the proteins of the present
invention are produced by growing host cells transformed by an expression vector described
above under conditions whereby the protein of interest is expressed. The selection of the
appropriate growth conditions is within the skill of the art.

In one embodiment, the transformed cells secrete the polypeptide product into the
surrounding media. Certain regulatory sequences can be included in the vector to enhance
secretion of the protein product, for example using a tissue plasminogen activator (TPA)
leader sequence, a γ-interferon signal sequence or other signal peptide sequences from known
secretory proteins. The secreted polypeptide product can then be isolated by various
techniques described herein, for example, using standard purification techniques such as but
not limited to, hydroxyapatite resins, column chromatography, ion-exchange
chromatography, size-exclusion chromatography, electrophoresis, HPLC, immunoadsorbent
techniques, affinity chromatography, immunoprecipitation, and the like.

Alternatively, the transformed cells are disrupted, using chemical, physical or
mechanical means, which lyse the cells yet keep the Env polypeptides substantially intact.
Intracellular proteins can also be obtained by removing components from the cell wall or
membrane, e.g., by the use of detergents or organic solvents, such that leakage of the Env
polypeptides occurs. Such methods are known to those of skill in the art and are described in,
e.g., Protein Purification Applications: A Practical Approach, (E.L.V. Harris and S. Angal,
Eds., 1990).

For example, methods of disrupting cells for use with the present invention include
but are not limited to: sonication or ultrasonication; agitation; liquid or solid extrusion; heat
treatment; freeze-thaw; desiccation; explosive decompression; osmotic shock; treatment with
lytic enzymes including proteases such as trypsin, neuraminidase and lysozyme; alkali
treatment; and the use of detergents and solvents such as bile salts, sodium dodecylsulphate,
Triton, NP40 and CHAPS. The particular technique used to disrupt the cells is largely a
matter of choice and will depend on the cell type in which the polypeptide is expressed,
culture conditions and any pre-treatment used.

Following disruption of the cells, cellular debris is removed, generally by
centrifugation, and the intracellularly produced Env polypeptides are further purified, using
standard purification techniques such as but not limited to, column chromatography,
ion-exchange chromatography, size-exclusion chromatography, electrophoresis, HPLC,
immunoadsorbent techniques, affinity chromatography, immunoprecipitation, and the like.

For example, one method for obtaining the intracellular Env polypeptides of the
present invention involves affinity purification, such as by immunoaffinity chromatography
using anti-Env specific antibodies, or by lectin affinity chromatography. Particularly
preferred lectin resins are those that recognize mannose moieties such as but not limited to
resins derived from Galanthus nivalis agglutinin (GNA), Lens culinaris agglutinin (LCA or
lentil lectin), Pisum sativum agglutinin (PSA or pea lectin), Narcissus pseudonarcissus
agglutinin (NPA) and Allium ursinum agglutinin (AUA). The choice of a suitable affinity
resin is within the skill in the art. After affinity purification, the Env polypeptides can be
further purified using conventional techniques well known in the art, such as by any of the
techniques described above.

It may be desirable to produce Env (e.g., gp120) complexes, either with itself or other
proteins. Such complexes are readily produced by, e.g., co-transfecting host cells with
constructs encoding the Env (e.g., gp120) and/or other polypeptides of the desired
complex. Co-transfection can be accomplished either in trans or cis, i.e., by using separate
vectors or by using a single vector which bears both the Env gene and the other gene. If done
using a single vector, both genes can be driven by a single set of control elements or,
alternatively, the genes can be present on the vector in individual expression cassettes, driven
by individual control elements. Following expression, the proteins will spontaneously associate.
Alternatively, the complexes can be formed by mixing the individual proteins together which
have been produced separately, either in purified or semi-purified form, or even by mixing
culture media in which host cells expressing the proteins have been cultured. See
International Publication No. WO 96/04301, published February 15, 1996, for a description
of such complexes.

Relatively small polypeptides, i.e., up to about 50 amino acids in length, can be
conveniently synthesized chemically, for example by any of several techniques that are
known to those skilled in the peptide art. In general, these methods employ the sequential
addition of one or more amino acids to a growing peptide chain. Normally, either the amino
or carboxyl group of the first amino acid is protected by a suitable protecting group. The
protected or derivatized amino acid can then be either attached to an inert solid support or
utilized in solution by adding the next amino acid in the sequence having the complementary
(amino or carboxyl) group suitably protected, under conditions that allow for the formation of
an amide linkage. The protecting group is then removed from the newly added amino acid
residue and the next amino acid (suitably protected) is then added, and so forth. After the
desired amino acids have been linked in the proper sequence, any remaining protecting
groups (and any solid support, if solid phase synthesis techniques are used) are removed
sequentially or concurrently, to render the final polypeptide. By simple modification of this
general procedure, it is possible to add more than one amino acid at a time to a growing
chain, for example, by coupling (under conditions which do not racemize chiral centers) a
protected tripeptide with a properly protected dipeptide to form, after deprotection, a
pentapeptide. See, e.g., J. M. Stewart and J. D. Young, Solid Phase Peptide Synthesis 
(Pierce Chemical Co., Rockford, IL 1984) and G. Barany and R. B. Merrifield, The Peptides: 
Analysis, Synthesis. Biology , editors E. Gross and J. Meienhofer, Vol. 2, (Academic Press, 




20 New York, 1 980), pp. 3-254, for solid phase peptide synthesis techniques; and M. Bodansky, 
Principles of Peptide Synthesis , (Springer-Verlag, Berlin 1984) and E. Gross and J. 



25 fluorenylmethoxycarbonyl (Fmoc) benzyloxycarbonyl (Cbz); p-toluenesulfonyl (Tx); 2,4- 
dinitrophenyl; benzyl (Bzl); biphenylisopropyloxycarboxy-carbonyl, t- 
amyloxycarbonyl, isobornyloxycarbonyl, o-bromobenzyloxycarbonyl, cyclohexyl, isopropyl, 
acetyl, o-nitrophenylsulfonyl and the like. 



30 divinylbenzene cross-linked-styrene-based polymers, for example, divinylbenzene- 

hydroxymethylstyrene copolymers, divinylbenzene-chloromethylstyrene copolymers and 
divinylbenzene-benzhydrylaminopolystyrene copolymers. 





Typical solid supports are cross-linked polymeric supports. These can include 



25 



WO 00/39303 PCT/US99/31272 

The polypeptide analogs of the present invention can also be chemically prepared by 
other methods such as by the method of simultaneous multiple peptide synthesis. See, e.g., 
Houghten, Proc. Natl. Acad. Sci. USA (1985) 82:5131-5135; U.S. Patent No. 4,631,211.

Diagnostic and Vaccine Applications

The intracellularly produced Env polypeptides of the present invention, complexes
thereof, or the polynucleotides coding therefor, can be used for a number of diagnostic and
therapeutic purposes. For example, the proteins and polynucleotides, or antibodies generated
against the same, can be used in a variety of assays to determine the presence of reactive
antibodies and/or Env proteins in a biological sample to aid in the diagnosis of HIV infection
or disease status or as a measure of response to immunization.

The presence of antibodies reactive with the Env (e.g., gp120) polypeptides and,
conversely, antigens reactive with antibodies generated thereto, can be detected using
standard electrophoretic and immunodiagnostic techniques, including immunoassays such as
competition, direct reaction, or sandwich type assays. Such assays include, but are not
limited to, western blots; agglutination tests; enzyme-labeled and mediated immunoassays,
such as ELISAs; biotin/avidin type assays; radioimmunoassays; immunoelectrophoresis;
immunoprecipitation, etc. The reactions generally include revealing labels such as
fluorescent, chemiluminescent, radioactive, or enzymatic labels or dye molecules, or other
methods for detecting the formation of a complex between the antigen and the antibody or
antibodies reacted therewith.

Solid supports can be used in the assays, such as nitrocellulose, in membrane or
microtiter well form; polyvinylchloride, in sheets or microtiter wells; polystyrene latex, in
beads or microtiter plates; polyvinylidene fluoride; diazotized paper; nylon membranes;
activated beads, and the like.

Typically, the solid support is first reacted with the biological sample (or the gp120
proteins), washed and then the antibodies (or a sample suspected of containing antibodies)
applied. After washing to remove any non-bound ligand, a secondary binder moiety is added
under suitable binding conditions, such that the secondary binder is capable of associating
selectively with the bound ligand. The presence of the secondary binder can then be detected
using techniques well known in the art. Typically, the secondary binder will comprise an
antibody directed against the antibody ligands. A number of anti-human immunoglobulin
(Ig) molecules are known in the art (e.g., commercially available goat anti-human Ig or rabbit
anti-human Ig). Ig molecules for use herein will preferably be of the IgG or IgA type;
however, IgM may also be appropriate in some instances. The Ig molecules can be readily
conjugated to a detectable enzyme label, such as horseradish peroxidase, glucose oxidase,
beta-galactosidase, alkaline phosphatase and urease, among others, using methods known to
those of skill in the art. An appropriate enzyme substrate is then used to generate a detectable
signal.

Alternatively, a "two antibody sandwich" assay can be used to detect the proteins of
the present invention. In this technique, the solid support is reacted first with one or more of
the antibodies directed against Env (e.g., gp120), washed and then exposed to the test sample.
Antibodies are again added and the reaction visualized using either a direct color reaction or
using a labeled second antibody, such as an anti-immunoglobulin labeled with horseradish
peroxidase, alkaline phosphatase or urease.

Assays can also be conducted in solution, such that the viral proteins and antibodies
thereto form complexes under precipitating conditions. The precipitated complexes can then
be separated from the test sample, for example, by centrifugation. The reaction mixture can
be analyzed to determine the presence or absence of antibody-antigen complexes using any of
a number of standard methods, such as those immunodiagnostic methods described above.

The modified Env proteins, produced as described above, or antibodies to the
proteins, can be provided in kits, with suitable instructions and other necessary reagents, in
order to conduct immunoassays as described above. The kit can also contain, depending on
the particular immunoassay used, suitable labels and other packaged reagents and materials
(e.g., wash buffers and the like). Standard immunoassays, such as those described above, can
be conducted using these kits.

The Env polypeptides and polynucleotides encoding the polypeptides can also be used
in vaccine compositions, individually or in combination, in e.g., prophylactic (i.e., to prevent
infection) or therapeutic (to treat HIV following infection) vaccines. The vaccines can
comprise mixtures of one or more of the modified Env proteins (or nucleotide sequences
encoding the proteins), such as Env (e.g., gp120) proteins derived from more than one viral
isolate. The vaccine may also be administered in conjunction with other antigens and
immunoregulatory agents, for example, immunoglobulins, cytokines, lymphokines, and
chemokines, including but not limited to IL-2, modified IL-2 (cys125-ser125), GM-CSF,
IL-12, γ-interferon, IP-10, MIP1β and RANTES. The vaccines may be administered as
polypeptides or, alternatively, as naked nucleic acid vaccines (e.g., DNA), using viral vectors
(e.g., retroviral vectors, adenoviral vectors, adeno-associated viral vectors) or non-viral
vectors (e.g., liposomes, particles coated with nucleic acid or protein). The vaccines may also
comprise a mixture of protein and nucleic acid, which in turn may be delivered using the
same or different vehicles. The vaccine may be given more than once (e.g., a "prime"
administration followed by one or more "boosts") to achieve the desired effects. The same
composition can be administered as the prime and as the one or more boosts. Alternatively,
different compositions can be used for priming and boosting.

The vaccines will generally include one or more "pharmaceutically acceptable
excipients or vehicles" such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary
substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may
be present in such vehicles.

A carrier is optionally present which is a molecule that does not itself induce the
production of antibodies harmful to the individual receiving the composition. Suitable
carriers are typically large, slowly metabolized macromolecules such as proteins,
polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid
copolymers, lipid aggregates (such as oil droplets or liposomes), and inactive virus particles.
Such carriers are well known to those of ordinary skill in the art. Furthermore, the Env
polypeptide may be conjugated to a bacterial toxoid, such as toxoid from diphtheria, tetanus,
cholera, etc.

Adjuvants may also be used to enhance the effectiveness of the vaccines. Such
adjuvants include, but are not limited to: (1) aluminum salts (alum), such as aluminum
hydroxide, aluminum phosphate, aluminum sulfate, etc.; (2) oil-in-water emulsion
formulations (with or without other specific immunostimulating agents such as muramyl
peptides (see below) or bacterial cell wall components), such as for example (a) MF59
(International Publication No. WO 90/14837), containing 5% Squalene, 0.5% Tween 80, and
0.5% Span 85 (optionally containing various amounts of MTP-PE (see below), although not
required) formulated into submicron particles using a microfluidizer such as Model 110Y
microfluidizer (Microfluidics, Newton, MA), (b) SAF, containing 10% Squalane, 0.4%
Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP (see below) either
microfluidized into a submicron emulsion or vortexed to generate a larger particle size
emulsion, and (c) Ribi™ adjuvant system (RAS), (Ribi Immunochem, Hamilton, MT)
containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall components
from the group consisting of monophosphoryl lipid A (MPL), trehalose dimycolate (TDM),
and cell wall skeleton (CWS), preferably MPL + CWS (Detox™); (3) saponin adjuvants,
such as Stimulon™ (Cambridge Bioscience, Worcester, MA), or particles
generated therefrom such as ISCOMs (immunostimulating complexes); (4) Complete
Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines, such as
interleukins (IL-1, IL-2, etc.), macrophage colony stimulating factor (M-CSF), tumor necrosis
factor (TNF), etc.; (6) detoxified mutants of a bacterial ADP-ribosylating toxin such as a
cholera toxin (CT), a pertussis toxin (PT), or an E. coli heat-labile toxin (LT), particularly
LT-K63 (where lysine is substituted for the wild-type amino acid at position 63), LT-R72
(where arginine is substituted for the wild-type amino acid at position 72), CT-S109 (where
serine is substituted for the wild-type amino acid at position 109), and PT-K9/G129 (where
lysine is substituted for the wild-type amino acid at position 9 and glycine substituted at
position 129) (see, e.g., International Publication Nos. WO93/13202 and WO92/19265); and
(7) other substances that act as immunostimulating agents to enhance the effectiveness of the
composition.

Muramyl peptides include, but are not limited to, N-acetyl-muramyl-L-threonyl-D-isoglutamine
(thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(MTP-PE), etc.

Typically, the vaccine compositions are prepared as injectables, either as liquid
solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles
prior to injection may also be prepared. The preparation also may be emulsified or
encapsulated in liposomes for enhanced adjuvant effect, as discussed above.

The vaccines will comprise a therapeutically effective amount of the modified Env
proteins, or complexes of the proteins, or nucleotide sequences encoding the same, and any
other of the above-mentioned components, as needed. By "therapeutically effective amount"
is meant an amount of a modified Env (e.g., gp120) protein which will induce a protective
immunological response in the uninfected, infected or unexposed individual to which it is
administered. Such a response will generally result in the development in the subject of a
secretory, cellular and/or antibody-mediated immune response to the vaccine. Usually, such
a response includes but is not limited to one or more of the following effects: the production
of antibodies from any of the immunological classes, such as immunoglobulins A, D, E, G or
M; the proliferation of B and T lymphocytes; the provision of activation, growth and
differentiation signals to immunological cells; expansion of helper T cells, suppressor T cells,
and/or cytotoxic T cells.

Preferably, the effective amount is sufficient to bring about treatment or prevention of
disease symptoms. The exact amount necessary will vary depending on the subject being
treated; the age and general condition of the individual to be treated; the capacity of the
individual's immune system to synthesize antibodies; the degree of protection desired; the
severity of the condition being treated; the particular Env polypeptide selected and its mode
of administration, among other factors. An appropriate effective amount can be readily
determined by one of skill in the art. A "therapeutically effective amount" will fall in a
relatively broad range that can be determined through routine trials.

Once formulated, delivery of the nucleic acid vaccines may be accomplished with or without viral
vectors, as described above, by injection using either a conventional syringe or a gene gun,
such as the Accell® gene delivery system (PowderJect Technologies, Inc., Oxford, England).
Delivery of DNA into cells of the epidermis is particularly preferred as this mode of
administration provides access to skin-associated lymphoid cells and provides for a transient
presence of DNA in the recipient. Nucleic acids and/or peptides can be injected either
subcutaneously, epidermally, intradermally, intramucosally such as nasally, rectally and
vaginally, intraperitoneally, intravenously, orally or intramuscularly. Other modes of
administration include oral and pulmonary administration, suppositories, needle-less
injection, transcutaneous and transdermal applications. Dosage treatment may be a single
dose schedule or a multiple dose schedule. Administration of nucleic acids may also be
combined with administration of peptides or other substances.

While the invention has been described in conjunction with the preferred specific
embodiments thereof, it is to be understood that the foregoing description as well as the
examples which follow are intended to illustrate and not limit the scope of the invention.
Other aspects, advantages and modifications within the scope of the invention will be
apparent to those skilled in the art to which the invention pertains.

Experimental

Below are examples of specific embodiments for carrying out the present invention.
The examples are offered for illustrative purposes only, and are not intended to limit the
scope of the present invention in any way.

Efforts have been made to ensure accuracy with respect to numbers used (e.g.,
amounts, temperatures, etc.), but some experimental error and deviation should, of course, be
allowed for.

Example 1 

A.1. Best-Fit and Homology Searches

The crystal structure of HXB-2 gp120 was downloaded from the Brookhaven
database (COMPLEX (HIV ENVELOPE PROTEIN/CD4/FAB) 15-JUN-98 1GC1
TITLE: HIV-1 GP120 CORE COMPLEXED WITH CD4 AND A NEUTRALIZING
HUMAN ANTIBODY). Beta strands 3, 2, 21, and 20 of gp120 form a sheet near the CD4
binding site. Strands β-3 and β-2 are connected by the V1/V2 loop. Strands β-21 and β-20
are connected by another small loop. The H-bonds at the interface between strands β-2 and
β-21 are the only connection between domains of the "lower" half of the protein (joining
helix alpha 1 to the CD4 binding site). This beta sheet and these loops mask some antigens
(e.g., antigens which may generate neutralizing antibodies) that are only exposed during
CD4 binding.

Constructs were designed that remove enough of the beta sheet to expose the antigens in the CD4
binding site, but leave enough of the protein to allow correct folding.
Specifically targeted were modifications to the small loop and, optionally, deletion of the V1/V2
loops. Three different types of constructs were designed: (1) constructs encoding
polypeptides that leave the number of residues making up the entire 4-strand beta sheet intact,
but replace one or more residues; (2) constructs encoding polypeptides having at least one
residue of at least one beta strand excised; or (3) constructs encoding polypeptides having at
least two residues of at least one beta strand excised. Thus, a total of 6 different turns were
needed to rejoin the ends of the strands.

Initially, residues in the small loop (residues 427-430, relative to HXB-2) and
connected beta strands (β-20 and β-21) were modified to contain Gly and Pro (common in
beta turns). These sequences were then used as the target to match in each search. The
geometry of the target was matched to known proteins in the Brookhaven Protein Data Bank.
In particular, 5-residue turns (including an overlapping single residue at the N-terminal, the 2
residue target turn and 2 overlapping residues at the C-terminal) were searched in the
databases. In other words, these modified loops add a 2 residue turn that should be able to
support a geometry that will maintain the beta-sheet structure of the wild type protein. The
calculations were performed using the default parameters in the Loop Search feature of the
Biopolymer module of the Sybyl molecular modeling package. In each case, the 25 best fits
based on geometry alone were reviewed and, of those, several were selected for homology and fit.
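
The geometric comparison described above can be illustrated with a small sketch. The algorithm used by the Loop Search feature of Sybyl is not described herein, so the following Python code substitutes a simpler, superposition-free measure, the distance RMSD between the C-alpha coordinates of two equal-length fragments, and ranks candidate turns by it. The function names, the toy coordinates and the use of C-alpha atoms alone are assumptions made for illustration only.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return all inter-point distances for a list of (x, y, z) tuples."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_rmsd(target, candidate):
    """Distance RMSD between two equal-length sets of C-alpha coordinates.

    This compares internal distances, so no superposition is needed.
    It is NOT claimed to be the measure used by Sybyl's Loop Search.
    """
    if len(target) != len(candidate):
        raise ValueError("fragments must have the same number of residues")
    dt = pairwise_distances(target)
    dc = pairwise_distances(candidate)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(dt, dc)) / len(dt))


def rank_candidates(target, candidates, keep=25):
    """Sort candidate fragments by distance RMSD and keep the best `keep`."""
    scored = [(distance_rmsd(target, xyz), name) for name, xyz in candidates.items()]
    return sorted(scored)[:keep]


# Hypothetical five-residue target and two hypothetical database fragments.
target = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0),
          (2.0, 6.0, 0.0), (-1.5, 4.5, 0.0)]
shifted = [(x + 10.0, y - 2.0, z + 1.0) for x, y, z in target]   # same shape, moved
stretched = [(1.2 * x, y, z) for x, y, z in target]               # distorted shape
ranking = rank_candidates(target, {"shifted_copy": shifted, "stretched": stretched})
```

In this toy case the translated copy of the target ranks first because its internal distances are unchanged, which is the behavior one would want from any geometry-only pre-filter applied before homology is considered.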
In addition, it was also determined what modifications could be made to remove most
of the V1/V2 loop (residues 124-198, relative to HXB-2) yet leave the geometry of the
protein intact. As with the small loop, constructs were also designed which excised one or
more residues from the β-2 strand (residues 119-123 of HXB-2), the β-3 strand (residues
199-201 of HXB-2) or both β-2 and β-3. For these constructs, known loops were searched to
match the geometry of a pentamer (including two remaining residues from the N-terminal
side, a 2 residue turn and 1 C-terminal residue). For these searches, Gly-Gly was preferred as
the insert along with at least one C-terminal substitution.

A.2. Small Loop Replacements

In one aspect, the native sequence was replaced with residues that expose the CD4
binding site, but leave the overall geometry of the protein relatively unchanged. For the
small loop replacements, the target to match was: ASN425-MET426-GLY427-GLY428-GLY431.
Results of the search are summarized in Table 1.

Table 1: Search of Small Loop (Asn425 through Gly431)

Rank      Sequence              RMSD      % Homology  Seq Id No.
Best fit  LYS-ASP-SER-ASN-ASN   0.16689   62.5        27
3         TYR-GLY-LEU-GLY-LEU   0.220308  62.5        28
4         GLU-ARG-GLU-ASP-GLY   0.241754  62.5        29
7         ARG-LYS-GLY-GLY-ASN   0.24881   100         30
12        TRP-THR-GLY-SER-TYR   0.26417   83.33       31

Based on these results, constructs encoding Gly-Gly (#7), Gly-Ser (#12) or Gly-Gly-Asn (#7)
were recommended.
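
By way of illustration only, the shortlisting reflected in Table 1 can be sketched in Python as follows. The values are those of Table 1; the 80% homology cutoff, the ordering rule and the reading of "Best fit" as rank 1 are assumptions, since the selection criteria actually applied are not stated beyond "homology and fit".

```python
from typing import NamedTuple


class Hit(NamedTuple):
    rank: int          # rank by geometry alone
    sequence: str      # five-residue database fragment
    rmsd: float
    homology: float    # percent


# Values transcribed from Table 1; "Best fit" is taken to mean rank 1 (assumption).
TABLE_1 = [
    Hit(1, "LYS-ASP-SER-ASN-ASN", 0.16689, 62.5),
    Hit(3, "TYR-GLY-LEU-GLY-LEU", 0.220308, 62.5),
    Hit(4, "GLU-ARG-GLU-ASP-GLY", 0.241754, 62.5),
    Hit(7, "ARG-LYS-GLY-GLY-ASN", 0.24881, 100.0),
    Hit(12, "TRP-THR-GLY-SER-TYR", 0.26417, 83.33),
]


def shortlist(hits, min_homology=80.0):
    """Keep hits at or above a (hypothetical) homology cutoff.

    Survivors are ordered by homology (highest first), then by RMSD.
    """
    kept = [h for h in hits if h.homology >= min_homology]
    return sorted(kept, key=lambda h: (-h.homology, h.rmsd))


def turn_residues(hit):
    """Return residues 3 and 4 of the fragment.

    These align with GLY427-GLY428 of the target stated above.
    """
    return "-".join(part.capitalize() for part in hit.sequence.split("-")[2:4])


selected = shortlist(TABLE_1)
turns = [turn_residues(h) for h in selected]
```

Under these assumed criteria the sketch retains ranks 7 and 12 and yields the turns Gly-Gly and Gly-Ser, in line with the recommendation above; a different cutoff would of course give a different shortlist.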

As V1/V2 and one or more residues of β-2 and β-3 are also optionally deleted in the
modified polypeptides of the invention, known loops to match the geometry of the V1/V2
loop were also searched. For the V1/V2 loop, the target to match was:
Lys121-Leu122-Gly123-Gly124-Ser199. Some notable matches are shown in Table 2:



Table 2: Search of V1/V2 loop (Lys121 through Ser199)

Rank      Sequence              RMSD      % Homology  Seq Id No.
Best fit  GLN-VAL-HIS-ASP-GLU   0.154764  68.75       32
2         LYS-GLU-GLY-ASP-LYS   0.15718   81.25       33
9         ARG-SER-GLY-ARG-SER   0.173731  68.75       34
11        THR-LEU-GLY-ASN-SER   0.175554  81.25       35
16        HIS-PHE-GLY-ALA-GLY   0.178772  93.75       36


Based on these searches, constructs encoding Gly-Asn in place of V1/V2 were 
recommended. 

A.3. One Additional Residue Excisions

For a slightly truncated small loop, one more residue was trimmed from each beta
strand to slightly shorten the beta sheet. The target to match was:
ILE424-ASN425-GLY426-GLY427-LYS432. Results are shown in Table 3:



Table 3: Search of Beta sheet shortened by One residue (Ile424 through Lys432)

Rank       Sequence              RMSD      % Homology  Seq Id No.
Best fit:  ARG-MET-ALA-PRO-VAL   0.316805  58.33       37
Best hom:  ASP-SER-ASP-GLY-PRO   0.440896  83.33       38



Although these searches showed more variation and worse fits than the previous
truncation, the Pro-Val or Pro-Leu encoding constructs were very similar. Accordingly,
Ala-Pro encoding constructs were recommended.

Sequences encoding gp120 polypeptides having V1/V2 deleted and an additional
residue from β-2 or β-3 excised were also searched. For the V1/V2 loop, the target to match was:
VAL120-LYS121-GLY122-GLY123-VAL200. Some notable matches are shown in Table 4.



Table 4: Search of V1/V2 loop (Val120 through Val200)

Rank       Sequence              RMSD      % Homology  Seq Id No.
Best fit:  THR-VAL-ASP-PRO-TYR   0.400892  58.33333    39
2          SER-THR-ASN-PRO-LEU   0.402575  54.16667    40
3          THR-ARG-SER-PRO-LEU   0.403965  58.33333    41
7          ARG-MET-ALA-PRO-VAL   0.440118  58.33333    42


The construct encoding Ala-Pro (e.g., #7) was recommended.

A.4. Further Excisions

In yet another truncation, an additional residue was trimmed from the β-20 and β-21
strands to further shorten the beta sheet. The target to match was
ILE423-ILE424-GLY425-GLY426-ALA433. Notable matches are shown in Table 5.



Table 5: Search of Beta sheet shortened by Two Residues (Ile423 through Ala433)

Rank       Sequence              RMSD      % Homology  Seq Id No.
Best fit:  THR-TYR-GLU-GLY-VAL   0.130107  79.16666    43
2          GLN-VAL-GLY-ASN-THR   0.138245  79.16666    44
3          THR-VAL-GLY-GLY-ILE   0.153362  100         45



A construct encoding Gly-Gly (e.g., #3), which has 100% homology, was
recommended.

Also searched were sequences encoding a deleted V1/V2 region and at least two
residues excised from β-2 or β-3, or at least one residue excised from each of β-2 and β-3.
The target to match was: CYS119-VAL120-GLY121-GLY122-ILE201. Notable matches are shown in
Table 6.

Table 6: Search of V1/V2 loop (Cys119 through Ile201)

Rank       Sequence              RMSD      % Homology  Seq Id No.
Best fit:  ASP-LEU-PRO-GLY-CYS   0.250501  75          46
4          ASP-VAL-GLY-GLY-LEU   0.290383  100         47


It was determined that both constructs would be used.

B.1. Constructs encoding modified Env polypeptides

As described above, the native loops extruding from the 4-β antiparallel strands were
excised and replaced with 1 to 3 residue turns. The loops were replaced so as to leave the
entire β-strands intact, or excised by trimming one or more amino acids from each side of the
connected strands. The ends of the strands were rejoined with
turns that preserve the same backbone geometry (e.g., tertiary structure of β-20 and β-21), as
determined by searching the Brookhaven Protein Data Bank.
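
As a non-limiting sketch, the excision-and-rejoin operation described above can be expressed in Python by parsing the anchor-turn-anchor notation used for the constructs herein (e.g., Val120-Gly-Gly-Ile201) and applying it to a sequence. The function names and the toy sequence are hypothetical. The sketch also assumes that the input sequence is contiguous and numbered in the same scheme as the notation; a sequence from another strain would first need its positions mapped to that numbering.

```python
import re

# Only the residues that occur in the turns used herein.
THREE_TO_ONE = {"Ala": "A", "Asn": "N", "Gly": "G", "Pro": "P", "Ser": "S"}

# Anchor such as "Val120" or "ILE201B" (a trailing letter is a construct label).
ANCHOR = re.compile(r"([A-Za-z]{3})(\d+)[A-Z]?")


def parse_modification(spec):
    """Split e.g. 'Val120-Gly-Gly-Ile201' into (120, 'GG', 201)."""
    parts = spec.strip("-").split("-")
    first = ANCHOR.fullmatch(parts[0])
    last = ANCHOR.fullmatch(parts[-1])
    if not first or not last:
        raise ValueError(f"cannot parse {spec!r}")
    turn = "".join(THREE_TO_ONE[p.capitalize()] for p in parts[1:-1])
    return int(first.group(2)), turn, int(last.group(2))


def apply_modification(sequence, spec, first_number=1):
    """Keep residues up to the left anchor, add the turn, resume at the right anchor.

    `first_number` is the residue number of sequence[0]. Anchor residue
    identities are not checked in this sketch.
    """
    left, turn, right = parse_modification(spec)
    if not first_number <= left < right <= first_number + len(sequence) - 1:
        raise ValueError("anchors fall outside the sequence")
    head = sequence[: left - first_number + 1]
    tail = sequence[right - first_number:]
    return head + turn + tail


# Toy 12-residue sequence (hypothetical): residues 4-9 are replaced by Gly-Gly.
modified = apply_modification("MKVLAAAAAISQ", "Val3-Gly-Gly-Ile10")
```

Here the six residues between the two anchors are removed and a two-residue turn is inserted, so the toy sequence shortens from 12 to 8 residues while both anchor residues are retained.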

Table 7A is a summary of the truncations of the variable regions 1 and 2
recommended for this study, as determined in Example 1.A. above.

Table 7A

V1/V2 Modifications            SEQ ID NO  Figure
-LEU122-GLY-ASN-SER199         7          10
-LYS121-ALA-PRO-VAL200-        6          9
-VAL120-GLY-GLY-ILE201-        4          7
-VAL120-PRO-GLY-ILE201B-       5          8
-VAL120-GLY-ALA-GLY-ALA204-    3          6
-VAL120-GLY-GLY-ALA-THR202-    8          11
-VAL127-GLY-ALA-GLY-ASN195-    25         28


As previously noted, the polypeptides encoded by the constructs of the present
invention are numbered relative to HXB-2, but the particular amino acid residue of the
polypeptides encoded by these exemplary constructs is based on SF162. Thus, for example,
although amino acid residue 195 in HXB-2 is a serine (S), constructs encoding polypeptides
having the wild type SF162 sequence will have an asparagine (N) at this position. Table 7B
shows just three of the variations in amino acid sequence between strains HXB-2 and SF162.
The entire sequences, including differences in residue and amino acid number, of HXB-2 and
SF162 are shown in the alignment of Figure 2 (SEQ ID NOs:1 and 2).
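
The relationship between HXB-2 numbering and the native numbering of another strain can be sketched as follows. This Python example walks a gapped pairwise alignment and records, for each reference position, the aligned residue of the second sequence and that residue's own number. The aligned strings are toy data and are not the HXB-2 or SF162 sequences of Figure 2; the function name is likewise hypothetical.

```python
def reference_numbering(ref_aligned, query_aligned, ref_start=1, query_start=1):
    """Map reference residue numbers to (query residue, query number).

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    Reference positions aligned to a gap in the query are omitted.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    ref_no, query_no = ref_start - 1, query_start - 1
    for ref_res, query_res in zip(ref_aligned, query_aligned):
        if ref_res != "-":
            ref_no += 1
        if query_res != "-":
            query_no += 1
        if ref_res != "-" and query_res != "-":
            mapping[ref_no] = (query_res, query_no)
    return mapping


# Toy alignment (hypothetical): the query lacks two residues and differs at one.
ref = "ACDEFGHIK"
query = "AC--FGNIK"
table = reference_numbering(ref, query)
# Reference position 7 (H) aligns with query residue N, which is query residue 5.
```

The toy result has the same shape as an entry of Table 7B: a position named by its reference number carries a different residue and a lower native number in the second sequence because of upstream deletions.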

Table 7B

HXB-2 amino acid number   HXB-2 Residue   SF162 Residue/amino acid number
128                       Serine (S)      Thr (T)/114
195                       Serine (S)      Asn (N)/188
426                       Met (M)         Arg (R)/411



Constructs containing deletions in the β-20 strand, β-21 strand and small loop were
also constructed. Shown in Table 8 are constructs encoding truncations in these regions. The
constructs in Table 8 are numbered relative to HXB-2 but the unmodified amino acid
sequence is based on SF162. Thus, the construct encodes an arginine (Arg) as is found in
SF162 in the amino acid position numbered 426 relative to HXB-2 (See, also, Table 7B).
Changes from wildtype (SF162) are shown in bold in Table 8.



Table 8

Small Loop/β-20 and β-21 (Modified)   SEQ ID NO  Figure
-TRP427-GLY-GLY431-                   9          12
-ARG426-GLY-GLY-GLY431-               10         13
-ARG426-GLY-SER-GLY431B-              11         14
-ARG426-GLY-GLY-ASN-LYS432-           12         15
-ASN425-ALA-PRO-LYS432-               13         16
-ILE424-GLY-GLY-ALA433-               14         17
-ILE423-GLY-GLY-MET434-               15         18
-GLN422-GLY-GLY-TYR435-               16         19
-GLN422-ALA-PRO-TYR435B-              17         20


The deletion constructs shown in Tables 7 and 8 for each one of the β-strands and
combinations of them are constructed. These deletions will be tested in the Env forms gp120,
gp140 and gp160 from different HIV strains, such as subtype B strains (e.g., SF162, US4, SF2),
subtype E strains (e.g., CM235) and subtype C strains (e.g., AF110968 or AF110975).
Exemplary constructs for SF162 are shown in the
Figures and are summarized in Table 9. As noted above in Figure 2 and Table 7B, in the
bridging sheet region, the amino acid sequence of SF162 differs from HXB-2 in that the
Met426 of HXB-2 is an Arg in SF162. In Table 9, V1/V2 refers to deletions in the V1/V2
region; bsm refers to a modification in the bridging sheet small loop.

Table 9

Construct                       Seq. Id.  Fig.  Modification/Amino acid sequence
Val120-Ala204                   3         6     V1/V2: Val120-Gly-Ala-Gly-Ala204
Val120-Ile201                   4         7     V1/V2: Val120-Gly-Gly-Ile201
Val120-Ile201B                  5         8     V1/V2: Val120-Pro-Gly-Ile201
Lys121-Val200                   6         9     V1/V2: Lys121-Ala-Pro-Val200
Leu122-Ser199                   7         10    V1/V2: Leu122-Gly-Asn-Ser199
Val120-Thr202                   8         11    V1/V2: Val120-Gly-Gly-Ala-Thr202
Trp427-Gly431                   9         12    bsm: Trp427-Gly-Gly431
Arg426-Gly431                   10        13    bsm: Arg426-Gly-Gly-Gly431
Arg426-Gly431B                  11        14    bsm: Arg426-Gly-Ser-Gly431
Arg426-Lys432                   12        15    bsm: Arg426-Gly-Gly-Asn-Lys432
Asn425-Lys432                   13        16    bsm: Asn425-Ala-Pro-Lys432
Ile424-Ala433                   14        17    bsm: Ile424-Gly-Gly-Ala433
Ile423-Met434                   15        18    bsm: Ile423-Gly-Gly-Met434
Gln422-Tyr435                   16        19    bsm: Gln422-Gly-Gly-Tyr435
Val127-Asn195                   25        28    bsm: Val127-Gly-Ala-Gly-Asn195
Gln422-Tyr435B                  17        20    bsm: Gln422-Ala-Pro-Tyr435
Leu122-Ser199; Arg426-Gly431                    V1/V2/bsm: Leu122-Gly-Asn-Ser199; Arg426-Gly-Gly-Gly431
Leu122-Ser199; Arg426-Lys432                    V1/V2/bsm: Leu122-Gly-Asn-Ser199; Arg426-Gly-Gly-Asn-Lys432
Leu122-Ser199-Trp427-Gly431                     V1/V2/bsm: Leu122-Gly-Asn-Ser199; Trp427-Gly-Gly431
Lys121-Val200-Asn425-Lys432                     V1/V2/bsm: Lys121-Ala-Pro-Val200; Asn425-Ala-Pro-Lys432
Val120-Ile201-Ile424-Ala433                     V1/V2/bsm: Val120-Gly-Gly-Ile201; Ile424-Gly-Gly-Ala433
Val120-Ile201B-Ile424-Ala433    23        26    V1/V2/bsm: Val120-Pro-Gly-Ile201; Ile424-Gly-Gly-Ala433
Val120-Thr202; Ile424-Ala433    24        27    V1/V2/bsm: Val120-Gly-Gly-Ala-Thr202; Ile424-Gly-Gly-Ala433
Val127-Asn195; Arg426-Gly431    25        29    V1/V2/bsm: Val127-Gly-Ala-Gly-Asn195; Arg426-Gly-Gly-Gly431



Combinations of V1/V2 deletions and bridging sheet small loop modifications in 
addition to those specifically shown in Table 9 are also within the scope of the present 
invention. Various forms of the different embodiments of the invention, described herein, 
may be combined. 

The first screening will be done after transient expression in COS-7, RD and/or 293
cells. The proteins that are expressed will be analyzed by immunoblot, ELISA, and for
binding to mAbs directed to the CD4 binding site and other important epitopes on gp120 to
determine integrity of structure. They will also be tested in a CD4 binding assay and, in
addition, for the binding of neutralizing antibodies, for example using patient sera or mAb 448D
(directed to Glu370 and Tyr384, a region of the CD4 binding groove that is not altered by the
deletions).

The immunogenicity of these novel Env glycoproteins will be tested in rodents and
primates. The structures will be administered as DNA vaccines or adjuvanted protein
vaccines or in combined modalities. The goal of these vaccinations will be to achieve broadly
reactive neutralizing antibody responses.
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What is claimed is: 

5 1 . A polynucleotide encoding a modified HIV Env polypeptide wherein the 

polypeptide has at least one amino acid deleted or replaced in the region corresponding to 
residues 420 to 436 relative to HXB-2 (SEQ ID NO:l). 

2. The polynucleotide of claim 1, wherein the region corresponding to residues 124- 
10 198 relative to HXB-2 is deleted and at least one amino acid is deleted or replaced in the 

regions corresponding to the residues 1 19 to 123 and 199 to 210 relative to HXB-2 (SEQ ID 
NO:l). 

3. The polynucleotide of claim 1, wherein at least one amino acid in the region 

1 5 corresponding to residues 427 through 429 relative to HXB-2 (SEQ ID NO: 1 ) is deleted or 
replaced. 

4. The polynucleotide of claim 2, wherein at least one amino acid of the in the region 
corresponding to residues 427 through 429 relative to HXB-2 (SEQ ID NO:l) is deleted or 

20 replaced. 

5. The polynucleotide of claim 1, wherein the amino acid sequence of the modified 
HIV Env polypeptide is based on strain SF162. 

6. An immunogenic modified HIV Env polypeptide having at least one amino acid
deleted or replaced in the region corresponding to residues 420 through 436, relative to HXB-2
(SEQ ID NO:1).

7. The polypeptide of claim 6, wherein one amino acid is deleted in the region
corresponding to residues 420 through 436, relative to HXB-2 (SEQ ID NO:1).
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8. The polypeptide of claim 6, wherein more than one amino acid is deleted in the 
region corresponding to residues 420 through 436, relative to HXB-2 (SEQ ID NO:1).

9. The polypeptide of claim 6, wherein at least one amino acid is replaced in the
region corresponding to residues 420 through 436, relative to HXB-2 (SEQ ID NO:1).

10. The polypeptide of claim 6, wherein at least one amino acid residue between
about amino acid residue 427 and amino acid residue 429 relative to HXB-2 (SEQ ID NO:1)
is deleted or replaced.

11. The polypeptide of claim 6, wherein the V1 and V2 regions of the polypeptide are
truncated.

12. The polypeptide of claim 10, wherein the V1 and V2 regions of the polypeptide
are truncated.

13. The polypeptide of claim 6, wherein the amino acid sequence of the modified 
HIV Env polypeptide is based on strain SF162. 

14. A construct comprising the nucleotide sequence depicted in Figure 6 (SEQ ID
NO:3).

15. A construct comprising the nucleotide sequence depicted in Figure 7 (SEQ ID
NO:4).

16. A construct comprising the nucleotide sequence depicted in Figure 8 (SEQ ID
NO:5).

17. A construct comprising the nucleotide sequence depicted in Figure 9 (SEQ ID
NO:6).




18. A construct comprising the nucleotide sequence depicted in Figure 10 (SEQ ID
NO:7).

19. A construct comprising the nucleotide sequence depicted in Figure 11 (SEQ ID
NO:8).

20. A construct comprising the nucleotide sequence depicted in Figure 12 (SEQ ID
NO:9).

21. A construct comprising the nucleotide sequence depicted in Figure 13 (SEQ ID
NO:10).

22. A construct comprising the nucleotide sequence depicted in Figure 14 (SEQ ID
NO:11).

23. A construct comprising the nucleotide sequence depicted in Figure 15 (SEQ ID
NO:12).

24. A construct comprising the nucleotide sequence depicted in Figure 16 (SEQ ID
NO:13).

25. A construct comprising the nucleotide sequence depicted in Figure 17 (SEQ ID
NO:14).

26. A construct comprising the nucleotide sequence depicted in Figure 18 (SEQ ID
NO:15).

27. A construct comprising the nucleotide sequence depicted in Figure 19 (SEQ ID
NO:16).

28. A construct comprising the nucleotide sequence depicted in Figure 20 (SEQ ID
NO:17).
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29. A construct comprising the nucleotide sequence depicted in Figure 21 (SEQ ID 
NO:18).

30. A construct comprising the nucleotide sequence depicted in Figure 22 (SEQ ID
NO:19).

31. A construct comprising the nucleotide sequence depicted in Figure 23 (SEQ ID
NO:20).

32. A construct comprising the nucleotide sequence depicted in Figure 24 (SEQ ID
NO:21).

33. A construct comprising the nucleotide sequence depicted in Figure 25 (SEQ ID
NO:22).

34. A construct comprising the nucleotide sequence depicted in Figure 26 (SEQ ID
NO:23).

35. A construct comprising the nucleotide sequence depicted in Figure 27 (SEQ ID
NO:24).

36. A construct comprising the nucleotide sequence depicted in Figure 28 (SEQ ID
NO:25).

37. A construct comprising the nucleotide sequence depicted in Figure 29 (SEQ ID
NO:26).

38. A vaccine composition comprising a polynucleotide encoding a modified Env 
polypeptide according to any one of claims 1-5. 


39. A vaccine composition comprising a polynucleotide construct encoding a 
modified Env polypeptide according to any of claims 14-37. 
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40. A vaccine composition comprising a modified Env polypeptide according to any 
of claims 6-13. 



41. The vaccine composition of any of claims 38-40, further comprising an adjuvant.

42. A method of inducing an immune response in a subject, comprising administering a
polynucleotide according to any one of claims 1-5 in an amount sufficient to induce an
immune response in the subject.

43. A method of inducing an immune response in a subject, comprising administering a
polynucleotide construct according to any one of claims 14-37 in an amount sufficient to
induce an immune response in the subject.

44. A method of inducing an immune response in a subject comprising administering
a composition comprising a modified Env polypeptide according to any one of claims 6-13,
wherein the composition is administered in an amount sufficient to induce an immune
response in the subject.

45. The method of any of claims 42-44 further comprising administering an adjuvant 
to the subject. 

46. A method of inducing an immune response in a subject comprising:

(a) administering a first composition comprising a polynucleotide according to any of 
claims 1-5 in a priming step and 

(b) administering a second composition comprising a modified Env polypeptide 
according to any of claims 6-13, as a booster, in an amount sufficient to induce an immune 
response in the subject. 

47. The method of claim 46 wherein the first composition or second composition
further comprises an adjuvant.
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48. The method of claim 46 wherein the first and second compositions further 
comprise an adjuvant. 
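
The residue-range language used in claims 1 to 13 (deleting or replacing amino acids at positions numbered relative to HXB-2) can be illustrated with a minimal Python sketch. The function names `delete_residues` and `replace_residue`, and the toy sequence, are hypothetical and introduced only for illustration; the sketch assumes the input string is already indexed in the reference numbering (1-based, inclusive), which for a real Env sequence would first require alignment to HXB-2.

```python
def delete_residues(seq: str, start: int, end: int) -> str:
    """Remove residues start..end (1-based, inclusive) from seq."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("range must lie within the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return seq[:start - 1] + seq[end:]


def replace_residue(seq: str, position: int, new_residue: str) -> str:
    """Substitute the residue at a 1-based position with new_residue."""
    if not (1 <= position <= len(seq)):
        raise ValueError("position must lie within the sequence")
    if len(new_residue) != 1:
        raise ValueError("new_residue must be a single letter")
    return seq[:position - 1] + new_residue + seq[position:]


# Toy 20-residue string (assumption: NOT an Env or HXB-2 sequence).
toy = "ACDEFGHIKLMNPQRSTVWY"
deleted = delete_residues(toy, 5, 8)     # drops positions 5 through 8
replaced = replace_residue(toy, 3, "A")  # substitutes position 3
```

Keeping the positions 1-based and inclusive mirrors how the claims recite ranges such as "residues 420 through 436", so a range can be passed to the helper without off-by-one adjustment.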
[Figure: nucleotide sequence alignment panels, including FIG. 3C, of the constructs Leu122-Ser199, Val127-Asn195, Val120-Ile201B, Val120-Ala204, Val120-Ile201, Val120-Thr202 and Lys121-Val200, shown with a consensus sequence.]
[Figure: nucleotide sequence alignment panels of the constructs Leu122-Ser199-Tryp427-Gly431, Val127-Asn195-Arg426-Gly431, Val120-Thr202-Ile424-Ala433, Leu122-Ser199-Arg426-Lys432, Leu122-Ser199-Arg426-Gly431, Lys121-Val200-Asn425-Lys432, Val120-Ile201-Ile424-Ala433 and Val120-Ile201B-Ile424-Ala433, shown with a consensus sequence.]
571 600 
ACCCAGCTGC1 \ AGCCTGG 

GCTGC1 4 GCCTGGX 

ACCCAGCTGCX TTGAAC 3G :AGCCTSGCC 
ACCCAG TGCTG &A GCA T 
^CCCAGCTCCTG TG^.ACGC AGCCTGGCC 
ACCCA CTGCTS: TGAACGC CAGCC1 3GCC 
;cCTGGCC 

;CTGAACGGCAGCCTGGCC 
ACCCAGCTGCTGCTGAACGGCAGCCTGGCC 
601 630 
GAGGAGGGCGTGGTGATCCGCAGCGAG&AC 
r tGGAGGG - - "I TCCGCAGC 3AA 
GAGGAGGGCGT;TTGATCCGCAGCn;.GAAr 
.■;/.Gr;?.GGCrc:-'.TCATCCGCAGCGAG».' 
Cr.CC;.CGGCT"-GATCCGCAGCCA^Ar 



GAGGAGGGCGTGGTGATCCGCAGCGAGAAC 
631 660 
71 CACCG AAC CCAAGACCATCATCGT 
TTCACCGA CAACG CCAA 3ACCA1 CAT CGT C 
TiCiCCGACAACGCCAAGACCATCATCGT^ 
ri CATC C-ACAACC-CCAAGACCATCATCG': 
T-rCACCGACAACGCCAAG^CCATCATCGTG 

; rc ccc-ac a v cc :aa gacc atc m cgt 



TTCACCGACAACGCCAAGACCATCATCGTG 
661 690 
CAGC TGAAGGAGAGCGTGGAGATCAACTGC 
CAGCT 3AA ; CA 3AGCGTGGAG? TCAACXG C 
CAGCT3AAGGA3AGCG'i'GGAGATCA-.CT'.:' 
CAGCT'GAAGGAGAGCGTGGAGATCAACrGC 
CAGCTGAAGGAGAGCGTGGAGATCAACTGC 
CAGCTGAAGGAGAGCGTGGAGATCAACTGC 
TAG' TGA? : 3AGAO 3TGGAGATCAAC TGC 



FIG. 5D 



[FIG. 5, panels up to and including FIG. 5H: nucleotide sequence alignment, positions 691 to 1380, of the modified Env constructs (Leu122-Ser199-Tryp427-Gly431 through Val120-Ile201B-Ile424-Ala433) with the consensus row.]

[FIG. 5, remaining panels: nucleotide sequence alignment, positions 1381 to 2250, of the modified Env constructs (Leu122-Ser199-Tryp427-Gly431 through Val120-Ile201B-Ile424-Ala433) with the consensus row.]

SEQ ID NO:3 VAL120-ALA204

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGCCGGCGCCTGCCCCAA 

GGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTG 

CAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCC 

ACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGC 

GTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGA 

GAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCC 

CCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACA 

TCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTC 

GGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAG 

CTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAA 

CAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGA 

TCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATC 

CGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAA 

CACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGT 

ACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGC 

GTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCC 

GCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAG 

CGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGC 

AGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTG 

AAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGT 

GCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGA 

TGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGC 

CAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGT 

GGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCG 

GCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCT 

ACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCA 

TCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTG 

GCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTG 

ATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTAC 

TGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGA 

CGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCG 

GCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAAC 

TCGAG 



FIG. 6 



WO 00/39303 43 / 65 PCT/US99/31272

SEQ ID NO:4 VAL120-ILE201 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCATCACCCAGGCCTG 

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCAC 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGC 

CAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGAT 

CAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCG 

AGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAG 

CGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTG 

GGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCT 

GCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACC 

TGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGC 

TACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCAC 

CGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGA 

CCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAG 

GAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCA 

GCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCG 

TGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCC 

AGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCG 

AGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGG 

CCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCG 

CGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCT 

GAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCC

TGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGC 

GCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGC 

TGTAACTCGAG 

SEQ ID NO:5 VAL120-ILE201B 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCAGTCTTCG 
TTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCA 
CCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTGGGCCACCC 
ACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCTGGAGAACGTGACCGAGAACTTCAACA 
TGTGGAAGAACAACATGGTGGAGCAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGC 
CCTGCGTGCCCGGCATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGC 
CCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGT 
GAGCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCT 
GGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 
GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCC 
CGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGC 
GAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGACCATC 
GTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTC 
TTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 
GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATG 
TACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACG
GCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGC 
GCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGC 
GCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGC 
CGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGT 
GCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGG 
CATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCAT 
CTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAG 
CCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCT 
GATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGG 
ACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 
GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAG 
GGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCGCCGAGGGCATCG 
AGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCT 
GGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCG
CATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTG
GATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCAC 
CGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAG 
GGCTTCGAGCGCGCCCTGCTGTAACTCGAGCGTGCT 

SEQ ID NO:6 LYS121-VAL200 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGGCCCCCGTGATCACCCA 

GGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGC 

CATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCG 

TGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGG 

CCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTG 

CAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCAT 

CACCATCGGCCGCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGC 

CCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGC 

AGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATC 

GTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAAC 

AGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCG 

CATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCC 

GCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAG 

GAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGG 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

AGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGC 

ACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAA 

CATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGA 

TCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTG 

GGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGT 

GCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCG 

CCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGC 

ACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCC 

TGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGG 

CCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTG 

AGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCC 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCC

CTGCTGTAACTCGAGCGTGCT 

SEQ ID NO:7 LEU122-SER199



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGG 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA 

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC 

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC 

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT 

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTGCCGCATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCC 

CCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGC 

GGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAA 

CTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCA 

CCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTC 

CTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAG 

GCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGC 

CCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGG 

CCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTG 

ATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTG 

GAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACA 

CCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGA 

CAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTT 

CATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAA 

CCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCC 

CGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCC 

CTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTAC 

CACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGC 

TGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAG 

CGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGA 

GGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGA 

GCGCGCCCTGCTGTAACTCGAGCGTGCT 

SEQ ID NO:8 VAL120-THR202

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCGCCACCCAGGCCTG 

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCAC 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGC 

CAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGAT 

CAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCG 

AGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAG 

CGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTG 

GGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCT 

GCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACC 

TGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGC 

TACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCAC 

CGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGA 

CCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAG

GAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCA 

GCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCG 

TGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCC 

AGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCG 

AGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGG 

CCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCG 

CGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCT 

GAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCC 

TGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGC 

GCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGC 

TGTAACTCGAG 

SEQ ID NO:9 TRP427-GLY431 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCT 

GGGGCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATC 

ACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCG 

CCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGA 

AGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAG 

CGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGC 

GCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCA 

GAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCA 

TCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTG 

GGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTG 

GAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAG 

ATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAA 

GAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCA 

GCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCA 

TCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCC 

AGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGC 

GAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGA 

CCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCG 

CATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGC 

AGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCC 

GTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCA 

CATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 12 




SEQ ID NO:10 ARG426-GLY431

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGC 

GGCGGCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACAT 

CACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCC 

GCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTG 

AAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAA 

GCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGG 

CGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGC 

AGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGC 

ATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCT 

GGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCT

GGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGA 

GATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGA 

AGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATC 

AGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGC 

ATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 

CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 

CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACG 

ACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCC 

GCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTG 

CAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGC 

CGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:11 ARG426-GLY431B

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGC 

GGCAGCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACAT 

CACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCC 

GCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTG 

AAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAA 

GCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGG 

CGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGC 

AGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGC 

ATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCT 

GGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCT 

GGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGA 

GATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGA 

AGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATC 

AGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGC 

ATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 

CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 

CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACG 

ACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCC 

GCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTG 

CAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGC 

CGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:12 ARG426-LYS432

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGC 

GGCGGCAACAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACAT 

CACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCC 

GCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTG 

AAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAA 

GCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGG 

CGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGC

AGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGC 

ATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCT

GGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCT

GGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGA 

GATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGA 

AGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATC 

AGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGC 

ATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 

CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 

CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACG 

ACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCC 

GCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTG 

CAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGC 

CGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:13 ASN425-LYS432

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACGCCC 

CCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCC 

TGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGC 

GGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGA 

GCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCG 

TGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCA 

GCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAAC 

CTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCA 

GCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCT 

GGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAAC 

AAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAA 

CTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGC 

AGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGG 

CTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTC 

ACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGC 

TTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGA 

CCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAG 

CCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGA 

GCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGA 

TCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAG 

GGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGC 

CGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:14 ILE424-ALA433

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCGGCGGC 

GCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTG 

CTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGG 

CGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCC 

TGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACC 

CTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTG 

ACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCT 

GCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGC 

AGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGC 

TGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAG 

CCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACA 

CCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGA 

GCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGT 

GGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCG 

TGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCC 

CCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGC 

GACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTG 

TGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTG 

CTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCA 

GGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCA 

CCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCA 

TCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:15 ILE423-MET434

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCGGCGGCATG 

TACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACC 

CGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACAT 

GCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCG 

TGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGC 

GCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTG 

ACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGC 

CATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCC 

GCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGC 

GGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGA 

CCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACC 

TGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTG 

GAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACAT 

CAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAG 

CATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCC 

CCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGC 

AGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTG 

TTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGC 

CGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCT 

GAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACC 

GCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCC 

AGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 18 



WO 00/39303 



55 / 65 



PCT/US99/31272 



SEQ ID NO:16 GLN422-TYR435 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGGGCGGCTACGCC 

CCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGAC 

GGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCC 

CCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATG 

TTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTG 

CAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGA 

GGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGC 

TGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAG 

CTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGAT 

CTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCT 

ACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCT 

GGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGA 

TCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCG 

TGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCG 

GCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAG 

CCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAG 

CTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCG 

CGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGA 

ACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATC 

ATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGC 

TTCGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 19 




SEQ ID NO:17 GLN422-TYR435B 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGGCCCCCTACGCCC 

CCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACG 

GCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGAC 

AACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCC 

CACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGT 

TCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGC 

AGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAG 

GCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCT 

GGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGC 

TGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATC 

TGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTA 

CACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTG 

GACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGAT 

CTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGT 

GAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGG 

CCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGC 

CCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGC 

TACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGC 

GGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAA 

CAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCAT 

CGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTT 

CGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 20 




SEQ ID NO:18: LEU122-SER199; ARG426-GLY431 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGG 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA 

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC 

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC 

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT 

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTGCCGCATCAAGCAGATCATCAACCGCGGCGGCGGCAAGGCCATGTACGCCCCCCCCATCC 

GCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAG 

GAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGG 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

AGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGC 

ACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAA 

CATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGA 

TCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTG 

GGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGT 

GCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCG 

CCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGC 

ACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCC 

TGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGG 

CCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTG 

AGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCC 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCC 

CTGCTGTAACTCGAG 



FIG. 21 




SEQ ID NO:19 LEU122-SER199; ARG426-LYS432 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGG 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA 

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC 

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC 

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT 

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTGCCGCATCAAGCAGATCATCAACCGCGGCGGCAACAAGGCCATGTACGCCCCCCCCATCC 

GCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAG 

GAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGG 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

AGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGC 

ACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAA 

CATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGA 

TCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTG 

GGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGT 

GCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCG 

CCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGC 

ACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCC 

TGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGG 

CCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTG 

AGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCC 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCC 

CTGCTGTAACTCGAG 



FIG. 22 




SEQ ID NO:20: LEU122-SER199; TRP427-GLY431



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGG 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA 

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC 

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC 

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT 

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTGCCGCATCAAGCAGATCATCAACCGCTGGGGCGGCAAGGCCATGTACGCCCCCCCCATCC 

GCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAG 

GAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGG 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

AGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGC 

ACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAA 

CATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGA 

TCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTG 

GGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGT 

GCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCG 

CCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGC 

ACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCC 

TGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGG 

CCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTG 

AGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCC 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCC 

CTGCTGTAACTCGAG 



FIG. 23 




SEQ ID NO:21 LYS121-VAL200; ASN425-LYS432 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGGCCCCCGTGATCACCCA 

GGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGC 

CATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCG 

TGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGG 

CCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTG 

CAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCAT 

CACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGC 

CCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGC 

AGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATC 

GTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAAC 

AGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCG 

CATCAAGCAGATCATCAACGCCCCCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCG 

CTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACA 

CCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTAC 

AAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGT 

GGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGC 

CGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCG 

GCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAG 

CTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAA 

GGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGC 

CCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATG 

GAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCA 

GAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGG 

AACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGC

CTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTAC 

AGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATC 

GAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGC 

CCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGAT 

CCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTG

GGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACG 

CCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGC 

CGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTC 

GAG 



FIG. 24 




SEQ ID NO:22 VAL120-ILE201; ILE424-ALA433



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCATCACCCAGGCCTG 

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCAC 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCGGCGGCGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGC 

AACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGAT 

CTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGG 

TGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGC 

GAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACC 

ATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCA 

GCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGT 

GGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAG 

CTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCC 

AGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCG

CGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGG 

AGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGAC 

ATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTG 

CGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGC 

TTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGG 

CGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGG 

ACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCG 

CCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTG 

CTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATC 

GCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCT 

GCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 

SEQ ID NO:23: VAL120-ILE201B; ILE424-ALA433 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGCCCGGCATCACCCAGGCCTGC 

CCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTG 

AAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTG 

CACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGG 

AGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTG 

AAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCAT 

CGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG 

CAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCC 

AGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATG 

CACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACC 

TGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAA 

GCAGATCATCGGCGGCGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCA 

ACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATC 

TTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGT 

GGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCG 

AGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCA 

TGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAG 

CAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTG 

GGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGC 

TGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCA 

GCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGC 

GAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGA 

GAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACA 

TCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGC 

GCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCT 

TCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGC 

GGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGA 

CGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGC 

CCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGC 

TGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATC 

GCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCT 

GCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 

SEQ ID NO:24 VAL120-THR202; ILE424-ALA433



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCGCCACCCAGGCCTG 

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCAC 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCGGCGGCGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGC 

AACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGAT 

CTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGG 

TGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGC 

GAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACC 

ATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCA 

GCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGT 

GGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAG 

CTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCC 

AGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCG 

CGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGG

AGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGAC 

ATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTG 

CGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGC 

TTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGG 

CGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGG 

ACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCG 

CCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTG 

CTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATC 

GCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCT 

GCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



SEQ ID NO:25 VAL127-ASN195 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

GGGGCAGGGAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCC 

CATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTT 

CAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCG 

TGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGC 

GAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGATCAA 

CTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTA 

CGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGACCATC 

GTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCGGCGG 

CGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCGGCCC 

CAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGC 

AGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAAC 

ATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTT 

CCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGG 

TGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAG 

AAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATG 

GGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCA 

GCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG 

GCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTG 

CTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAG 

CTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCG 

AGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAG 

AAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACAT 

CAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCG 

CATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTT 

CCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCG 

GCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGAC 

GACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCC 

CGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCT 

GCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCG 

CCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 

SEQ ID NO:26 VAL127-ASN195; ARG426-GLY431 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

GGGGCAGGGAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCC 

CATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTT 

CAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCG 

TGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGC 

GAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGATCAA 

CTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTA 

CGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGACCATC 

GTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCGGCGG 

CGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCGGCCC 

CAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCGGCG 

GCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACC 

GGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCC 

CGGGGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAG 

ATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCG 

CGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGC 

CCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGA 

ACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATC 

AAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGG 

CATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGA 

GCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATC 

GACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAA 

CGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCA 

AGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCG 

TGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGA 

CCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAG 

CGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTG 

CGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATC 

GTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTA 

CTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGG 

CCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCC 

CCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



SEQUENCE LISTING 






<110> Chiron Corporation 

<120> MODIFIED HIV ENV POLYPEPTIDES

<130> 1605.100 

<140> 
<141> 

<160> 26 

<170> PatentIn Ver. 2.0

<210> 1 
<211> 856 
<212> PRT

<213> Human immunodeficiency virus 
<400> 1 

Met Arg Val Lys Glu Lys Tyr Gln His Leu Trp Arg Trp Gly Trp Arg
1 5 10 15

Trp Gly Thr Met Leu Leu Gly Met Leu Met Ile Cys Ser Ala Thr Glu
20 25 30 

Lys Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala 
35 40 45 

Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu 
50 55 60 

Val His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn 
65 70 75 80 

Pro Gln Glu Val Val Leu Val Asn Val Thr Glu Asn Phe Asn Met Trp
85 90 95

Lys Asn Asp Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp
100 105 110

Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Ser
115 120 125 

Leu Lys Cys Thr Asp Leu Lys Asn Asp Thr Asn Thr Asn Ser Ser Ser 
130 135 140 

Gly Arg Met Ile Met Glu Lys Gly Glu Ile Lys Asn Cys Ser Phe Asn
145 150 155 160

Ile Ser Thr Ser Ile Arg Gly Lys Val Gln Lys Glu Tyr Ala Phe Phe
165 170 175

Tyr Lys Leu Asp Ile Ile Pro Ile Asp Asn Asp Thr Thr Ser Tyr Lys
180 185 190 






Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys Thr
225 230 235 240



Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val Ile
260 265 270



Lys Arg Ile Arg Ile Gln Arg Gly Pro Gly Arg Ala Phe Val Thr Ile
305 310 315 320



Lys Trp Asn Asn Thr Leu Lys Gln Ile Ala Ser Lys Leu Arg Glu Gln
340 345 350



Ala Met Tyr Ala Pro Pro Ile Ser Gly Gln Ile Arg Cys Ser Ser Asn



Ser Glu Leu Tyr Lys Tyr Lys Val Val Lys Ile Glu Pro Leu Gly Val






Phe Asn Ile Thr Asn Trp Leu Trp Tyr Ile Lys Leu Phe Ile Met Ile
675 680 685 

Val Gly Gly Leu Val Gly Leu Arg Ile Val Phe Ala Val Leu Ser Ile
690 695 700

Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr His
705 710 715 720

Leu Pro Thr Pro Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu Glu Glu
725 730 735

Gly Gly Glu Arg Asp Arg Asp Arg Ser Ile Arg Leu Val Asn Gly Ser
740 745 750 



Leu Gly Arg Arg Gly Trp Glu Ala Leu Lys Tyr Trp Trp Asn Leu Leu 

785 790 795 800 

Gln Tyr Trp Ser Gln Glu Leu Lys Asn Ser Ala Val Ser Leu Leu Asn
805 810 815

Ala Thr Ala Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val

820 825 830 






Val Gln Gly Ala Cys Arg Ala Ile Arg His Ile Pro Arg Arg Ile Arg
835 840 845 



<210> 2 
<211> 847 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 2 

Met Arg Val Lys Gly Ile Arg Lys Asn Tyr Gln His Leu Trp Arg Gly
1 5 10 15

Gly Thr Leu Leu Leu Gly Met Leu Met Ile Cys Ser Ala Val Glu Lys

20 25 30 

Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr 
35 40 45 

Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu Val 
50 55 60 

His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn Pro 



Gln Glu Ile Val Leu Glu Asn Val Thr Glu Asn Phe Asn Met Trp Lys
85 90 95



Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu
115 120 125



Lys Glu Met Asp Arg Gly Glu Ile Lys Asn Cys Ser Phe Lys Val Thr
145 150 155 160

Thr Ser Ile Arg Asn Lys Met Gln Lys Glu Tyr Ala Leu Phe Tyr Lys
165 170 175

Leu Asp Val Val Pro Ile Asp Asn Asp Asn Thr Ser Tyr Lys Leu Ile



Asn Cys Asn Thr Ser Val Ile Thr Gln Ala Cys Pro Lys Val Ser Phe
195 200 205 



Lys Cys Asn Asp Lys Lys Phe Asn Gly Ser Gly Pro Cys Thr Asn Val 
225 230 235 240 






Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val Ser Thr Gln
245 250 255

Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Gly Val Val Ile Arg Ser
260 265 270

Glu Asn Phe Thr Asp Asn Ala Lys Thr Ile Ile Val Gln Leu Lys Glu
275 280 285

Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr Arg Lys Ser
290 295 300



Ile Thr Ile Gly Pro Gly Arg Ala Phe Tyr Ala Thr Gly Asp Ile Ile
305 310 315 320

Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Gly Glu Lys Trp Asn

325 330 335 



Asn Thr Leu Lys Gln Ile Val Thr Lys Leu Gln Ala Gln Phe Gly Asn
340 345 350

Lys Thr Ile Val Phe Lys Gln Ser Ser Gly Gly Asp Pro Glu Ile Val
355 360 365



Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys Asn Ser Thr 
370 375 380 

Gln Leu Phe Asn Ser Thr Trp Asn Asn Thr Ile Gly Pro Asn Asn Thr
385 390 395 400

Asn Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln Ile Ile Asn Arg
405 410 415

Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro Ile Arg Gly Gln
420 425 430

Ile Arg Cys Ser Ser Asn Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly
435 440 445

Gly Lys Glu Ile Ser Asn Thr Thr Glu Ile Phe Arg Pro Gly Gly Gly
450 455 460 

Asp Met Arg Asp Asn Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val 
465 470 475 480 

Lys Ile Glu Pro Leu Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val
485 490 495

Val Gln Arg Glu Lys Arg Ala Val Thr Leu Gly Ala Met Phe Leu Gly
500 505 510 

Phe Leu Gly Ala Ala Gly Ser Thr Met Gly Ala Arg Ser Leu Thr Leu 
515 520 525 

Thr Val Gln Ala Arg Gln Leu Leu Ser Gly Ile Val Gln Gln Gln Asn
530 535 540

Asn Leu Leu Arg Ala Ile Glu Ala Gln Gln His Leu Leu Gln Leu Thr
545 550 555 560 

Tyr Leu Lys Asp Gln Gln Leu Leu Gly Ile Trp Gly Cys Ser Gly Lys
580 585 590

Leu Ile Cys Thr Thr Ala Val Pro Trp Asn Ala Ser Trp Ser Asn Lys



Gln Asn Gln Gln Glu Lys Asn Glu Gln Glu Leu Leu Glu Leu Asp Lys
645 650 655

Trp Ala Ser Leu Trp Asn Trp Phe Asp Ile Ser Lys Trp Leu Trp Tyr
660 665 670

Ile Lys Ile Phe Ile Met Ile Val Gly Gly Leu Val Gly Leu Arg Ile
675 680 685

Val Phe Thr Val Leu Ser Ile Val Asn Arg Val Arg Gln Gly Tyr Ser
690 695 700



Ala Ala Arg Ile Val Glu Leu Leu Gly Arg Arg Gly Trp Glu Ala Leu
770 775 780

Lys Tyr Trp Gly Asn Leu Leu Gln Tyr Trp Ile Gln Glu Leu Lys Asn
785 790 795 800

Ser Ala Val Ser Leu Phe Asp Ala Ile Ala Ile Ala Val Ala Glu Gly



<210> 3 

<211> 2310 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Val120-Ala204
<400> 3 



gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360
ggcgcctgcc ccaaggtgag cttcgagccc atccccatcc actactgcgc ccccgccggc 420
ttcgccatcc tgaagtgcaa cgacaagaag ttcaacggca gcggcccctg caccaacgtg 480
agcaccgtgc agtgcaccca cggcatccgc cccgtggtga gcacccagct gctgctgaac 540
ggcagcctgg ccgaggaggg cgtggtgatc cgcagcgaga acttcaccga caacgccaag 600
accatcatcg tgcagctgaa ggagagcgtg gagatcaact gcacccgccc caacaacaac 660
acccgcaaga gcatcaccat cggccccggc cgcgccttct acgccaccgg cgacatcatc 720
ggcgacatcc gccaggccca ctgcaacatc agcggcgaga agtggaacaa caccctgaag 780
cagatcgtga ccaagctgca ggcccagttc ggcaacaaga ccatcgtgtt caagcagagc 840
agcggcggcg accccgagat cgtgatgcac agcttcaact gcggcggcga gttcttctac 900
tgcaacagca cccagctgtt caacagcacc tggaacaaca ccatcggccc caacaacacc 960
aacggcacca tcaccctgcc ctgccgcatc aagcagatca tcaaccgctg gcaggaggtg 1020
ggcaaggcca tgtacgcccc ccccatccgc ggccagatcc gctgcagcag caacatcacc 1080
ggcctgctgc tgacccgcga cggcggcaag gagatcagca acaccaccga gatcttccgc 1140
cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg 1200
aagatcgagc ccctgggcgt ggcccccacc aaggccaagc gccgcgtggt gcagcgcgag 1260
aagcgcgccg tgaccctggg cgccatgttc ctgggcttcc tgggcgccgc cggcagcacc 1320
atgggcgccc gcagcctgac cctgaccgtg caggcccgcc agctgctgag cggcatcgtg 1380
cagcagcaga acaacctgct gcgcgccatc gaggcccagc agcacctgct gcagctgacc 1440
gtgtggggca tcaagcagct gcaggcccgc gtgctggccg tggagcgcta cctgaaggac 1500
cagcagctgc tgggcatctg gggctgcagc ggcaagctga tctgcaccac cgccgtgccc 1560
tggaacgcca gctggagcaa caagagcctg gaccagatct ggaacaacat gacctggatg 1620
gagtgggagc gcgagatcga [...] aacctgatct acaccctgat cgaggagagc 1680
cagaaccagc aggagaagaa cgagcaggag ctgctggagc tggacaagtg ggccagcctg 1740
tggaactggt tcgacatcag caagtggctg tggtacatca agatcttcat catgatcgtg 1800
ggcggcctgg tgggcctgcg catcgtgttc accgtgctga gcatcgtgaa ccgcgtgcgc 1860
cagggctaca gccccctgag [...] cgcttccccg ccccccgcgg ccccgaccgc 1920
cccgagggca tcgaggagga gggcggcgag cgcgaccgcg accgcagcag ccccctggtg 1980
cacggcctgc tggccctgat ctgggacgac ctgcgcagcc tgtgcctgtt cagctaccac 2040
cgcctgcgcg acctgatcct gatcgccgcc cgcatcgtgg agctgctggg ccgccgcggc 2100
tgggaggccc tgaagtactg gggcaacctg ctgcagtact ggatccagga gctgaagaac 2160
agcgccgtga gcctgttcga cgccatcgcc atcgccgtgg ccgagggcac cgaccgcatc 2220
atcgaggtgg cccagcgcat cggccgcgcc ttcctgcaca tcccccgccg catccgccag 2280
ggcttcgagc gcgccctgct gtaactcgag 2310




<210> 4 
<211> 2316 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Ile201
<400> 4 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360 
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcaa ccgctggcag 1020
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatccgctg cagcagcaac 1080
atcaccggcc tgctgctgac ccgcgacggc ggcaaggaga tcagcaacac caccgagatc 1140
ttccgccccg gcggcggcga catgcgcgac aactggcgca gcgagctgta caagtacaag 1200
gtggtgaaga tcgagcccct gggcgtggcc cccaccaagg ccaagcgccg cgtggtgcag 1260
cgcgagaagc gcgccgtgac cctgggcgcc atgttcctgg gcttcctggg cgccgccggc 1320
agcaccatgg gcgcccgcag cctgaccctg accgtgcagg cccgccagct gctgagcggc 1380
atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1440
ctgaccgtgt ggggcatcaa gcagctgcag gcccgcgtgc tggccgtgga gcgctacctg 1500
aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1560
gtgccctgga acgccagctg gagcaacaag agcctggacc agatctggaa caacatgacc 1620
tggatggagt gggagcgcga gatcgacaac tacaccaacc tgatctacac cctgatcgag 1680
gagagccaga accagcagga gaagaacgag caggagctgc tggagctgga caagtgggcc 1740
agcctgtgga actggttcga catcagcaag tggctgtggt acatcaagat cttcatcatg 1800
atcgtgggcg gcctggtggg cctgcgcatc gtgttcaccg tgctgagcat cgtgaaccgc 1860
gtgcgccagg gctacagccc cctgagcttc cagacccgct tccccgcccc ccgcggcccc 1920
gaccgccccg agggcatcga ggaggagggc ggcgagcgcg accgcgaccg cagcagcccc 1980
ctggtgcacg gcctgctggc cctgatctgg gacgacctgc gcagcctgtg cctgttcagc 2040
taccaccgcc tgcgcgacct gatcctgatc gccgcccgca tcgtggagct gctgggccgc 2100
cgcggctggg aggccctgaa gtactggggc aacctgctgc agtactggat ccaggagctg 2160
aagaacagcg ccgtgagcct gttcgacgcc atcgccatcg ccgtggccga gggcaccgac 2220
cgcatcatcg aggtggccca gcgcatcggc cgcgccttcc tgcacatccc ccgccgcatc 2280
cgccagggct tcgagcgcgc cctgctgtaa ctcgag 2316



<210> 5 
<211> 2322 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Val120-Ile201B



<400> 5 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgcccggc 360 
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480 
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540 
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcaa ccgctggcag 1020
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatccgctg cagcagcaac 1080
atcaccggcc tgctgctgac ccgcgacggc ggcaaggaga tcagcaacac caccgagatc 1140
ttccgccccg gcggcggcga catgcgcgac aactggcgca gcgagctgta caagtacaag 1200
gtggtgaaga tcgagcccct gggcgtggcc cccaccaagg ccaagcgccg cgtggtgcag 1260
cgcgagaagc gcgccgtgac cctgggcgcc atgttcctgg gcttcctggg cgccgccggc 1320 
agcaccatgg gcgcccgcag cctgaccctg accgtgcagg cccgccagct gctgagcggc 1380 
atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1440 
ctgaccgtgt ggggcatcaa gcagctgcag gcccgcgtgc tggccgtgga gcgctacctg 1500 
aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1560 
gtgccctgga acgccagctg gagcaacaag agcctggacc agatctggaa caacatgacc 1620
tggatggagt gggagcgcga gatcgacaac tacaccaacc tgatctacac cctgatcgag 1680
gagagccaga accagcagga gaagaacgag caggagctgc tggagctgga caagtgggcc 1740
agcctgtgga actggttcga catcagcaag tggctgtggt acatcaagat cttcatcatg 1800 
atcgtgggcg gcctggtggg cctgcgcatc gtgttcaccg tgctgagcat cgtgaaccgc 1860 
gtgcgccagg gctacagccc cctgagcttc cagacccgct tccccgcccc ccgcggcccc 1920 
gaccgccccg agggcatcga ggaggagggc ggcgagcgcg accgcgaccg cagcagcccc 1980 
ctggtgcacg gcctgctggc cctgatctgg gacgacctgc gcagcctgtg cctgttcagc 2040 
taccaccgcc tgcgcgacct gatcctgatc gccgcccgca tcgtggagct gctgggccgc 2100 
cgcggctggg aggccctgaa gtactggggc aacctgctgc agtactggat ccaggagctg 2160 
aagaacagcg ccgtgagcct gttcgacgcc atcgccatcg ccgtggccga gggcaccgac 2220 
cgcatcatcg aggtggccca gcgcatcggc cgcgccttcc tgcacatccc ccgccgcatc 2280 
cgccagggct tcgagcgcgc cctgctgtaa ctcgagcgtg ct 2322 

<210> 6 
<211> 2328 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Lys121-Val200
<400> 6 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaaggcc 360 
cccgtgatca cccaggcctg ccccaaggtg agcttcgagc ccatccccat ccactactgc 420 
gcccccgccg gcttcgccat cctgaagtgc aacgacaaga agttcaacgg cagcggcccc 480 
tgcaccaacg tgagcaccgt gcagtgcacc cacggcatcc gccccgtggt gagcacccag 540
ctgctgctga acggcagcct ggccgaggag ggcgtggtga tccgcagcga gaacttcacc 600 
gacaacgcca agaccatcat cgtgcagctg aaggagagcg tggagatcaa ctgcacccgc 660 
cccaacaaca acacccgcaa gagcatcacc atcggccccg gccgcgcctt ctacgccacc 720 
ggcgacatca tcggcgacat ccgccaggcc cactgcaaca tcagcggcga gaagtggaac 780 
aacaccctga agcagatcgt gaccaagctg caggcccagt tcggcaacaa gaccatcgtg 840 
ttcaagcaga gcagcggcgg cgaccccgag atcgtgatgc acagcttcaa ctgcggcggc 900 
gagttcttct actgcaacag cacccagctg ttcaacagca cctggaacaa caccatcggc 960 
cccaacaaca ccaacggcac catcaccctg ccctgccgca tcaagcagat catcaaccgc 1020 
tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080 
agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140 
gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200
tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260
gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320
gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440 
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500 
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560 
accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg agcgtgct 2328



<210> 7
<211> 2334
<212> DNA

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Leu122-Ser199
<400> 7 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480 
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540 
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600 
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660 
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720 
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780 
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900 
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960 
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020 
aaccgctggc aggaggtggg caaggccatg tacgcccccc ccatccgcgg ccagatccgc 1080 
tgcagcagca acatcaccgg cctgctgctg acccgcgacg gcggcaagga gatcagcaac 1140
accaccgaga tcttccgccc cggcggcggc gacatgcgcg acaactggcg cagcgagctg 1200 
tacaagtaca aggtggtgaa gatcgagccc ctgggcgtgg cccccaccaa ggccaagcgc 1260 
cgcgtggtgc agcgcgagaa gcgcgccgtg accctgggcg ccatgttcct gggcttcctg 1320
ggcgccgccg gcagcaccat gggcgcccgc agcctgaccc tgaccgtgca ggcccgccag 1380
ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440
cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcgt gctggccgtg 1500 
gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560 
tgcaccaccg ccgtgccctg gaacgccagc tggagcaaca agagcctgga ccagatctgg 1620
aacaacatga cctggatgga gtgggagcgc gagatcgaca actacaccaa cctgatctac 1680
accctgatcg aggagagcca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740
gacaagtggg ccagcctgtg gaactggttc gacatcagca agtggctgtg gtacatcaag 1800 
atcttcatca tgatcgtggg cggcctggtg ggcctgcgca tcgtgttcac cgtgctgagc 1860 
atcgtgaacc gcgtgcgcca gggctacagc cccctgagct tccagacccg cttccccgcc 1920 
ccccgcggcc ccgaccgccc cgagggcatc gaggaggagg gcggcgagcg cgaccgcgac 1980 
cgcagcagcc ccctggtgca cggcctgctg gccctgatct gggacgacct gcgcagcctg 2040 
tgcctgttca gctaccaccg cctgcgcgac ctgatcctga tcgccgcccg catcgtggag 2100 
ctgctgggcc gccgcggctg ggaggccctg aagtactggg gcaacctgct gcagtactgg 2160 
atccaggagc tgaagaacag cgccgtgagc ctgttcgacg ccatcgccat cgccgtggcc 2220
gagggcaccg accgcatcat cgaggtggcc cagcgcatcg gccgcgcctt cctgcacatc 2280
ccccgccgca tccgccaggg cttcgagcgc gccctgctgt aactcgagcg tgct 2334 

<210> 8 
<211> 2316 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Thr202
<400> 8

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360 
gccacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480 
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540 
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720 
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840 
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcaa ccgctggcag 1020 
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatccgctg cagcagcaac 1080 
atcaccggcc tgctgctgac ccgcgacggc ggcaaggaga tcagcaacac caccgagatc 1140 
ttccgccccg gcggcggcga catgcgcgac aactggcgca gcgagctgta caagtacaag 1200 
gtggtgaaga tcgagcccct gggcgtggcc cccaccaagg ccaagcgccg cgtggtgcag 1260 
cgcgagaagc gcgccgtgac cctgggcgcc atgttcctgg gcttcctggg cgccgccggc 1320 
agcaccatgg gcgcccgcag cctgaccctg accgtgcagg cccgccagct gctgagcggc 1380 
atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1440 
ctgaccgtgt ggggcatcaa gcagctgcag gcccgcgtgc tggccgtgga gcgctacctg 1500 
aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1560 
gtgccctgga acgccagctg gagcaacaag agcctggacc agatctggaa caacatgacc 1620 
tggatggagt gggagcgcga gatcgacaac tacaccaacc tgatctacac cctgatcgag 1680 
gagagccaga accagcagga gaagaacgag caggagctgc tggagctgga caagtgggcc 1740 
agcctgtgga actggttcga catcagcaag tggctgtggt acatcaagat cttcatcatg 1800 
atcgtgggcg gcctggtggg cctgcgcatc gtgttcaccg tgctgagcat cgtgaaccgc 1860 
gtgcgccagg gctacagccc cctgagcttc cagacccgct tccccgcccc ccgcggcccc 1920 
gaccgccccg agggcatcga ggaggagggc ggcgagcgcg accgcgaccg cagcagcccc 1980 
ctggtgcacg gcctgctggc cctgatctgg gacgacctgc gcagcctgtg cctgttcagc 2040 
taccaccgcc tgcgcgacct gatcctgatc gccgcccgca tcgtggagct gctgggccgc 2100 
cgcggctggg aggccctgaa gtactggggc aacctgctgc agtactggat ccaggagctg 2160 
aagaacagcg ccgtgagcct gttcgacgcc atcgccatcg ccgtggccga gggcaccgac 2220 
cgcatcatcg aggtggccca gcgcatcggc cgcgccttcc tgcacatccc ccgccgcatc 2280 
cgccagggct tcgagcgcgc cctgctgtaa ctcgag 2316 

<210> 9 
<211> 2541 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: Trp427-Gly431
<400> 9 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctgggg cggcaaggcc 1260 
atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320 
ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380 
ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440
cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500
gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560
cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680
atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800 
agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860 
cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980
ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040 
gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 
agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160 
atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220
ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280
gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340 
ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400 
agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460 
gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520 
cgcgccctgc tgtaactcga g 2541 

<210> 10 
<211> 2541 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Arg426-Gly431 
<400> 10 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt [...] [...] actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgcggcgg cggcaaggcc 1260
atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320
ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380
ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440
cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500
gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560
cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680
atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800
agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860
cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980
ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040
gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100
agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160
atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220
ctggccctga tctgggacga cctgcgcagc ctgtgcctgt [...] ccgcctgcgc 2280
gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340
ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400
agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460
gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520
cgcgccctgc tgtaactcga g 2541



<210> 11 
<211> 2541 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Arg426-Gly431B 
<400> 11 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 






atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgcggcag cggcaaggcc 1260
atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320
ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380
ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440
cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500
gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560
cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680
atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800
agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860
cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980
ttcgacatca gcaagtggct gtggtacatc aagatcttca [...] gggcggcctg 2040
gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100
agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160
atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220
ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280
gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340
ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400
agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460
gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520
cgcgccctgc tgtaactcga g 2541



<210> 12 

<211> 2541 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Arg426-Lys432 
<400> 12 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgcggcgg caacaaggcc 1260
atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320
ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380
ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440
cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500
gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560
cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680
atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800
agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860
cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980
ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040
gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100
agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160
atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220
ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280
gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340
ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400
agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460
gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520
cgcgccctgc tgtaactcga g 2541



<210> 13 

<211> 2535 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Asn425-Lys432 



<400> 13 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca acgcccccaa ggccatgtac 1260
gcccccccca tccgcggcca gatccgctgc agcagcaaca tcaccggcct gctgctgacc 1320
cgcgacggcg gcaaggagat cagcaacacc accgagatct tccgccccgg cggcggcgac 1380
atgcgcgaca actggcgcag cgagctgtac aagtacaagg tggtgaagat cgagcccctg 1440
ggcgtggccc ccaccaaggc caagcgccgc gtggtgcagc gcgagaagcg cgccgtgacc 1500
ctgggcgcca tgttcctggg cttcctgggc gccgccggca gcaccatggg cgcccgcagc 1560
ctgaccctga ccgtgcaggc ccgccagctg ctgagcggca tcgtgcagca gcagaacaac 1620
ctgctgcgcg ccatcgaggc ccagcagcac ctgctgcagc tgaccgtgtg gggcatcaag 1680
cagctgcagg cccgcgtgct ggccgtggag cgctacctga aggaccagca gctgctgggc 1740
atctggggct gcagcggcaa gctgatctgc accaccgccg tgccctggaa cgccagctgg 1800
agcaacaaga gcctggacca gatctggaac aacatgacct ggatggagtg ggagcgcgag 1860
atcgacaact acaccaacct gatctacacc ctgatcgagg agagccagaa ccagcaggag 1920
aagaacgagc aggagctgct ggagctggac aagtgggcca gcctgtggaa ctggttcgac 1980
atcagcaagt ggctgtggta catcaagatc ttcatcatga tcgtgggcgg cctggtgggc 2040
ctgcgcatcg tgttcaccgt gctgagcatc gtgaaccgcg tgcgccaggg ctacagcccc 2100
ctgagcttcc agacccgctt ccccgccccc cgcggccccg accgccccga gggcatcgag 2160
gaggagggcg gcgagcgcga ccgcgaccgc agcagccccc tggtgcacgg cctgctggcc 2220
ctgatctggg acgacctgcg cagcctgtgc ctgttcagct accaccgcct gcgcgacctg 2280 
atcctgatcg ccgcccgcat cgtggagctg ctgggccgcc gcggctggga ggccctgaag 2340 
tactggggca acctgctgca gtactggatc caggagctga agaacagcgc cgtgagcctg 2400 
ttcgacgcca tcgccatcgc cgtggccgag ggcaccgacc gcatcatcga ggtggcccag 2460 
cgcatcggcc gcgccttcct gcacatcccc cgccgcatcc gccagggctt cgagcgcgcc 2520 
ctgctgtaac tcgag 2535 



<210> 14 

<211> 2529 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Ile424-Ala433 



<400> 14 














gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatcg gcggcgccat gtacgccccc 1260
cccatccgcg gccagatccg ctgcagcagc aacatcaccg gcctgctgct gacccgcgac 1320
ggcggcaagg agatcagcaa caccaccgag atcttccgcc ccggcggcgg cgacatgcgc 1380
gacaactggc gcagcgagct gtacaagtac aaggtggtga agatcgagcc cctgggcgtg 1440
gcccccacca aggccaagcg ccgcgtggtg cagcgcgaga agcgcgccgt gaccctgggc 1500
gccatgttcc tgggcttcct gggcgccgcc ggcagcacca tgggcgcccg cagcctgacc 1560
ctgaccgtgc aggcccgcca gctgctgagc ggcatcgtgc agcagcagaa caacctgctg 1620
cgcgccatcg aggcccagca gcacctgctg cagctgaccg tgtggggcat caagcagctg 1680
caggcccgcg tgctggccgt ggagcgctac ctgaaggacc agcagctgct gggcatctgg 1740
ggctgcagcg gcaagctgat ctgcaccacc gccgtgccct ggaacgccag ctggagcaac 1800
aagagcctgg accagatctg gaacaacatg acctggatgg agtgggagcg cgagatcgac 1860
aactacacca acctgatcta caccctgatc gaggagagcc agaaccagca ggagaagaac 1920
gagcaggagc tgctggagct ggacaagtgg gccagcctgt ggaactggtt cgacatcagc 1980
aagtggctgt ggtacatcaa gatcttcatc atgatcgtgg gcggcctggt gggcctgcgc 2040
atcgtgttca ccgtgctgag catcgtgaac cgcgtgcgcc agggctacag ccccctgagc 2100
ttccagaccc gcttccccgc cccccgcggc cccgaccgcc ccgagggcat cgaggaggag 2160
ggcggcgagc gcgaccgcga ccgcagcagc cccctggtgc acggcctgct ggccctgatc 2220
tgggacgacc tgcgcagcct gtgcctgttc agctaccacc gcctgcgcga cctgatcctg 2280
atcgccgccc gcatcgtgga gctgctgggc cgccgcggct gggaggccct gaagtactgg 2340
ggcaacctgc tgcagtactg gatccaggag ctgaagaaca gcgccgtgag cctgttcgac 2400
gccatcgcca tcgccgtggc cgagggcacc gaccgcatca tcgaggtggc ccagcgcatc 2460
ggccgcgcct tcctgcacat cccccgccgc atccgccagg gcttcgagcg cgccctgctg 2520
taactcgag 2529 

<210> 15 
<211> 2523 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Ile423-Met434
<400> 15 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcggcg gcatgtacgc cccccccatc 1260 
cgcggccaga tccgctgcag cagcaacatc accggcctgc tgctgacccg cgacggcggc 1320 
aaggagatca gcaacaccac cgagatcttc cgccccggcg gcggcgacat gcgcgacaac 1380 
tggcgcagcg agctgtacaa gtacaaggtg gtgaagatcg agcccctggg cgtggccccc 1440
accaaggcca agcgccgcgt ggtgcagcgc gagaagcgcg ccgtgaccct gggcgccatg 1500
ttcctgggct tcctgggcgc cgccggcagc accatgggcg cccgcagcct gaccctgacc 1560
gtgcaggccc gccagctgct gagcggcatc gtgcagcagc agaacaacct gctgcgcgcc 1620
atcgaggccc agcagcacct gctgcagctg accgtgtggg gcatcaagca gctgcaggcc 1680
cgcgtgctgg ccgtggagcg ctacctgaag gaccagcagc tgctgggcat ctggggctgc 1740
agcggcaagc tgatctgcac caccgccgtg ccctggaacg ccagctggag caacaagagc 1800 
ctggaccaga tctggaacaa catgacctgg atggagtggg agcgcgagat cgacaactac 1860 
accaacctga tctacaccct gatcgaggag agccagaacc agcaggagaa gaacgagcag 1920
gagctgctgg agctggacaa gtgggccagc ctgtggaact ggttcgacat cagcaagtgg 1980
ctgtggtaca tcaagatctt catcatgatc gtgggcggcc tggtgggcct gcgcatcgtg 2040
ttcaccgtgc tgagcatcgt gaaccgcgtg cgccagggct acagccccct gagcttccag 2100
acccgcttcc ccgccccccg cggccccgac cgccccgagg gcatcgagga ggagggcggc 2160
gagcgcgacc gcgaccgcag cagccccctg gtgcacggcc tgctggccct gatctgggac 2220
gacctgcgca gcctgtgcct gttcagctac caccgcctgc gcgacctgat cctgatcgcc 2280
gcccgcatcg tggagctgct gggccgccgc ggctgggagg ccctgaagta ctggggcaac 2340
ctgctgcagt actggatcca ggagctgaag aacagcgccg tgagcctgtt cgacgccatc 2400
gccatcgccg tggccgaggg caccgaccgc atcatcgagg tggcccagcg catcggccgc 2460
gccttcctgc acatcccccg ccgcatccgc cagggcttcg agcgcgccct gctgtaactc 2520
gag 2523



<210> 16 

<211> 2517 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Gln422-Tyr435 



<400> 16 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagggcggct acgccccccc catccgcggc 1260 
cagatccgct gcagcagcaa catcaccggc ctgctgctga cccgcgacgg cggcaaggag 1320 
atcagcaaca ccaccgagat cttccgcccc ggcggcggcg acatgcgcga caactggcgc 1380
agcgagctgt acaagtacaa ggtggtgaag atcgagcccc tgggcgtggc ccccaccaag 1440 
gccaagcgcc gcgtggtgca gcgcgagaag cgcgccgtga ccctgggcgc catgttcctg 1500 
ggcttcctgg gcgccgccgg cagcaccatg ggcgcccgca gcctgaccct gaccgtgcag 1560 
gcccgccagc tgctgagcgg catcgtgcag cagcagaaca acctgctgcg cgccatcgag 1620
gcccagcagc acctgctgca gctgaccgtg tggggcatca agcagctgca ggcccgcgtg 1680
ctggccgtgg agcgctacct gaaggaccag cagctgctgg gcatctgggg ctgcagcggc 1740
aagctgatct gcaccaccgc cgtgccctgg aacgccagct ggagcaacaa gagcctggac 1800 
cagatctgga acaacatgac ctggatggag tgggagcgcg agatcgacaa ctacaccaac 1860 
ctgatctaca ccctgatcga ggagagccag aaccagcagg agaagaacga gcaggagctg 1920
ctggagctgg acaagtgggc cagcctgtgg aactggttcg acatcagcaa gtggctgtgg 1980
tacatcaaga tcttcatcat gatcgtgggc ggcctggtgg gcctgcgcat cgtgttcacc 2040
gtgctgagca tcgtgaaccg cgtgcgccag ggctacagcc ccctgagctt ccagacccgc 2100
ttccccgccc cccgcggccc cgaccgcccc gagggcatcg aggaggaggg cggcgagcgc 2160
gaccgcgacc gcagcagccc cctggtgcac ggcctgctgg ccctgatctg ggacgacctg 2220
cgcagcctgt gcctgttcag ctaccaccgc ctgcgcgacc tgatcctgat cgccgcccgc 2280
atcgtggagc tgctgggccg ccgcggctgg gaggccctga agtactgggg caacctgctg 2340
cagtactgga tccaggagct gaagaacagc gccgtgagcc tgttcgacgc catcgccatc 2400
gccgtggccg agggcaccga ccgcatcatc gaggtggccc agcgcatcgg ccgcgccttc 2460
ctgcacatcc cccgccgcat ccgccagggc ttcgagcgcg ccctgctgta actcgag 2517 

<210> 17 
<211> 2517 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: Gln422-Tyr435B 

<400> 17 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag caggccccct acgccccccc catccgcggc 1260 
cagatccgct gcagcagcaa catcaccggc ctgctgctga cccgcgacgg cggcaaggag 1320 
atcagcaaca ccaccgagat cttccgcccc ggcggcggcg acatgcgcga caactggcgc 1380 
agcgagctgt acaagtacaa ggtggtgaag atcgagcccc tgggcgtggc ccccaccaag 1440 
gccaagcgcc gcgtggtgca gcgcgagaag cgcgccgtga ccctgggcgc catgttcctg 1500 
ggcttcctgg gcgccgccgg cagcaccatg ggcgcccgca gcctgaccct gaccgtgcag 1560 
gcccgccagc tgctgagcgg catcgtgcag cagcagaaca acctgctgcg cgccatcgag 1620
gcccagcagc acctgctgca gctgaccgtg tggggcatca agcagctgca ggcccgcgtg 1680
ctggccgtgg agcgctacct gaaggaccag cagctgctgg gcatctgggg ctgcagcggc 1740
aagctgatct gcaccaccgc cgtgccctgg aacgccagct ggagcaacaa gagcctggac 1800 
cagatctgga acaacatgac ctggatggag tgggagcgcg agatcgacaa ctacaccaac 1860 
ctgatctaca ccctgatcga ggagagccag aaccagcagg agaagaacga gcaggagctg 1920 
ctggagctgg acaagtgggc cagcctgtgg aactggttcg acatcagcaa gtggctgtgg 1980 
tacatcaaga tcttcatcat gatcgtgggc ggcctggtgg gcctgcgcat cgtgttcacc 2040 
gtgctgagca tcgtgaaccg cgtgcgccag ggctacagcc ccctgagctt ccagacccgc 2100 
ttccccgccc cccgcggccc cgaccgcccc gagggcatcg aggaggaggg cggcgagcgc 2160 
gaccgcgacc gcagcagccc cctggtgcac ggcctgctgg ccctgatctg ggacgacctg 2220 
cgcagcctgt gcctgttcag ctaccaccgc ctgcgcgacc tgatcctgat cgccgcccgc 2280 
atcgtggagc tgctgggccg ccgcggctgg gaggccctga agtactgggg caacctgctg 2340 
cagtactgga tccaggagct gaagaacagc gccgtgagcc tgttcgacgc catcgccatc 2400
gccgtggccg agggcaccga ccgcatcatc gaggtggccc agcgcatcgg ccgcgccttc 2460 
ctgcacatcc cccgccgcat ccgccagggc ttcgagcgcg ccctgctgta actcgag 2517 

<210> 18 
<211> 2322 
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Leu122-Ser199;
Arg426-Gly431



<400> 18 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420 
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480 
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840 
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900 
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960 
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020 
aaccgcggcg gcggcaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080 
agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140
gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200
tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260
gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320
gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500 
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560 
accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620 
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680 
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280 
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg ag 2322 



<210> 19 
<211> 2322 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Leu122-Ser199;
Arg426-Lys432



<400> 19 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020
aaccgcggcg gcaacaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080
agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140
gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200
tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260
gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320
gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560
accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg ag 2322

<210> 20
<211> 2322
<212> DNA

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Leu122-Ser199;
Trp427-Gly431



<400> 20 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420 
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480 
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540 
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600 
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020
aaccgctggg gcggcaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080
agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140
gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200
tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260
gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320
gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560
accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg ag 2322



<210> 21 

<211> 2310 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Lys121-Val200;
Asn425-Lys432



<400> 21 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaaggcc 360 
cccgtgatca cccaggcctg ccccaaggtg agcttcgagc ccatccccat ccactactgc 420 
gcccccgccg gcttcgccat cctgaagtgc aacgacaaga agttcaacgg cagcggcccc 480 
tgcaccaacg tgagcaccgt gcagtgcacc cacggcatcc gccccgtggt gagcacccag 540 
ctgctgctga acggcagcct ggccgaggag ggcgtggtga tccgcagcga gaacttcacc 600 
gacaacgcca agaccatcat cgtgcagctg aaggagagcg tggagatcaa ctgcacccgc 660 
cccaacaaca acacccgcaa gagcatcacc atcggccccg gccgcgcctt ctacgccacc 720 
ggcgacatca tcggcgacat ccgccaggcc cactgcaaca tcagcggcga gaagtggaac 780 
aacaccctga agcagatcgt gaccaagctg caggcccagt tcggcaacaa gaccatcgtg 840 
ttcaagcaga gcagcggcgg cgaccccgag atcgtgatgc acagcttcaa ctgcggcggc 900 
gagttcttct actgcaacag cacccagctg ttcaacagca cctggaacaa caccatcggc 960
cccaacaaca ccaacggcac catcaccctg ccctgccgca tcaagcagat catcaacgcc 1020 
cccaaggcca tgtacgcccc ccccatccgc ggccagatcc gctgcagcag caacatcacc 1080 
ggcctgctgc tgacccgcga cggcggcaag gagatcagca acaccaccga gatcttccgc 1140
cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg 1200
aagatcgagc ccctgggcgt ggcccccacc aaggccaagc gccgcgtggt gcagcgcgag 1260
aagcgcgccg tgaccctggg cgccatgttc ctgggcttcc tgggcgccgc cggcagcacc 1320
atgggcgccc gcagcctgac cctgaccgtg caggcccgcc agctgctgag cggcatcgtg 1380
cagcagcaga acaacctgct gcgcgccatc gaggcccagc agcacctgct gcagctgacc 1440
gtgtggggca tcaagcagct gcaggcccgc gtgctggccg tggagcgcta cctgaaggac 1500
cagcagctgc tgggcatctg gggctgcagc ggcaagctga tctgcaccac cgccgtgccc 1560
tggaacgcca gctggagcaa caagagcctg gaccagatct ggaacaacat gacctggatg 1620
gagtgggagc gcgagatcga caactacacc aacctgatct acaccctgat cgaggagagc 1680
cagaaccagc aggagaagaa cgagcaggag ctgctggagc tggacaagtg ggccagcctg 1740
tggaactggt tcgacatcag caagtggctg tggtacatca agatcttcat catgatcgtg 1800
ggcggcctgg tgggcctgcg catcgtgttc accgtgctga gcatcgtgaa ccgcgtgcgc 1860
cagggctaca gccccctgag cttccagacc cgcttccccg ccccccgcgg ccccgaccgc 1920
cccgagggca tcgaggagga gggcggcgag cgcgaccgcg accgcagcag ccccctggtg 1980
cacggcctgc tggccctgat ctgggacgac ctgcgcagcc tgtgcctgtt cagctaccac 2040
cgcctgcgcg acctgatcct gatcgccgcc cgcatcgtgg agctgctggg ccgccgcggc 2100
tgggaggccc tgaagtactg gggcaacctg ctgcagtact ggatccagga gctgaagaac 2160
agcgccgtga gcctgttcga cgccatcgcc atcgccgtgg ccgagggcac cgaccgcatc 2220
atcgaggtgg cccagcgcat cggccgcgcc ttcctgcaca tcccccgccg catccgccag 2280
ggcttcgagc gcgccctgct gtaactcgag 2310



<210> 22 

<211> 2298 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Val120-Ile201;
Ile424-Ala433



<400> 22 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcgg cggcgccatg 1020
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1080
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1140
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1200
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1260
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1320
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1380
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1440
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1500
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1560
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1620
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1680
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1740
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1800
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1860
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1920
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 1980
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2040
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2100
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2160
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2220
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2280
gccctgctgt aactcgag 2298

<210> 23
<211> 2298
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence:
Val120-Ile201B; Ile424-Ala433

<400> 23

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgcccggc 360 
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480 
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540 
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720 
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840 
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcgg cggcgccatg 1020 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1080 
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1140 
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1200
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1260 
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1320 
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1380
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1440 
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1500 
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1560 
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1620 
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1680 
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1740 
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1800 
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1860 
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1920 
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 1980
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2040 
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2100 
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2160 
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2220 
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2280 
gccctgctgt aactcgag 2298

<210> 24 
<211> 2298 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Val120-Thr202;
Ile424-Ala433



<400> 24 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360
gccacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcgg cggcgccatg 1020
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1080
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1140
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1200
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1260
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1320
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1380
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1440
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1500
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1560
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1620
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1680
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1740
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1800
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1860
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1920
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 1980
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2040
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2100
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2160
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2220
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2280
gccctgctgt aactcgag 2298



<210> 25 

<211> 2358 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Val127-Asn195

<400> 25

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggggc agggaactgc aacaccagcg tgatcaccca ggcctgcccc 420
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480 
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900 
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960 
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020 
accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140 
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200 
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260 
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1320 
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1380 
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1440 
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1500 
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1560 
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1620 
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1680
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1740 
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1800 
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1860 
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1920 
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1980 
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 2040 
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2100 
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2160 
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2220 
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2280
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2340 
gccctgctgt aactcgag 2358 

<210> 26 
<211> 2352 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val127-Asn195;
Arg426-Gly431

<400> 26 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggggc agggaactgc aacaccagcg tgatcaccca ggcctgcccc 420
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020
accctgccct gccgcatcaa gcagatcatc aaccgcggcg gcggcaaggc catgtacgcc 1080
ccccccatcc gcggccagat ccgctgcagc agcaacatca ccggcctgct gctgacccgc 1140
gacggcggca aggagatcag caacaccacc gagatcttcc gccccggggg cggcgacatg 1200
cgcgacaact ggcgcagcga gctgtacaag tacaaggtgg tgaagatcga gcccctgggc 1260
gtggccccca ccaaggccaa gcgccgcgtg gtgcagcgcg agaagcgcgc cgtgaccctg 1320
ggcgccatgt tcctgggctt cctgggcgcc gccggcagca ccatgggcgc ccgcagcctg 1380
accctgaccg tgcaggcccg ccagctgctg agcggcatcg tgcagcagca gaacaacctg 1440
ctgcgcgcca tcgaggccca gcagcacctg ctgcagctga ccgtgtgggg catcaagcag 1500
ctgcaggccc gcgtgctggc cgtggagcgc tacctgaagg accagcagct gctgggcatc 1560
tggggctgca gcggcaagct gatctgcacc accgccgtgc cctggaacgc cagctggagc 1620
aacaagagcc tggaccagat ctggaacaac atgacctgga tggagtggga gcgcgagatc 1680
gacaactaca ccaacctgat ctacaccctg atcgaggaga gccagaacca gcaggagaag 1740
aacgagcagg agctgctgga gctggacaag tgggccagcc tgtggaactg gttcgacatc 1800
agcaagtggc tgtggtacat caagatcttc atcatgatcg tgggcggcct ggtgggcctg 1860
cgcatcgtgt tcaccgtgct gagcatcgtg aaccgcgtgc gccagggcta cagccccctg 1920
agcttccaga cccgcttccc cgccccccgc ggccccgacc gccccgaggg catcgaggag 1980
gagggcggcg agcgcgaccg cgaccgcagc agccccctgg tgcacggcct gctggccctg 2040
atctgggacg acctgcgcag cctgtgcctg ttcagctacc accgcctgcg cgacctgatc 2100
ctgatcgccg cccgcatcgt ggagctgctg ggccgccgcg gctgggaggc cctgaagtac 2160
tggggcaacc tgctgcagta ctggatccag gagctgaaga acagcgccgt gagcctgttc 2220
gacgccatcg ccatcgccgt ggccgagggc accgaccgca tcatcgaggt ggcccagcgc 2280
atcggccgcg ccttcctgca catcccccgc cgcatccgcc agggcttcga gcgcgccctg 2340
ctgtaactcg ag 2352